PIG: Privacy Jailbreak Attack on LLMs via Gradient-based Iterative In-Context Optimization
This repository contains the official code implementation of our paper.
Setup
First, create a virtual environment using Anaconda:
conda create -n pig python=3.9.19
conda activate pig
Second, you need to install the necessary dependencies:
pip install -r requirements.txt
Usage
You can run a privacy jailbreak attack using the following steps:
- First, modify parameters such as `dataset`, `target_model`, or `attack_model` in the script `run.sh`.
- Then, execute the privacy jailbreak attack by running `bash run.sh`.
- Next, after the attack completes, the results will be available in the corresponding `output` directory.
- Finally, evaluate the results by running `python eval.py` to compute metrics such as the attack success rate (ASR).
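For intuition, ASR is typically the fraction of attack attempts judged successful. The sketch below is illustrative only and does not reflect the actual logic of `eval.py`; the function name and input format are assumptions.

```python
# Minimal sketch of an Attack Success Rate (ASR) computation.
# Assumption: `results` is a list of booleans, one per attack attempt,
# where True means the privacy jailbreak was judged successful.
def attack_success_rate(results):
    """Return the fraction of successful attacks, or 0.0 for empty input."""
    if not results:
        return 0.0
    return sum(results) / len(results)


if __name__ == "__main__":
    # Example: 3 successes out of 4 attempts -> ASR = 0.75
    print(attack_success_rate([True, False, True, True]))
```

The real evaluation script may additionally report per-category metrics or use an LLM-based judge to decide success, so treat this only as a reference for how the final ratio is formed.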
Acknowledgements
Our PIG framework is based on EasyJailbreak. We thank the team for their open-source implementation.
