Vulnyzer
Overview
Hey there! Today, I want to share a cool project (or maybe a workaround, lol) I worked on called Vulnyzer. It's a Streamlit-based app that lets you interact with different AI models like GPT-4 and GPT-3.5. The best part? I customized GitHub Copilot to unlock some advanced features. Let's walk through how I did it step by step!
Key Features
- Model Selection: Pick between GPT-4, GPT-3.5, or input a custom model name.
- Document Handling: Upload and extract text from PDF documents.
- Video Processing: Get transcripts from YouTube videos.
- Dynamic Interaction: Test interactions with AI models using custom prompts and settings.
How I Built It
1. Customizing GitHub Copilot Extension to Get Auth Token
The first step was to tweak the GitHub Copilot extension to get the auth token. This token is like a golden key that lets you talk to the Copilot servers directly. Here’s how I did it:
Modifying extension.js
The first step was to dive into the extension code and figure out how I could customize it. I realized that by making some changes to the extension.js file, I could make Copilot use the GPT-4 model instead of the default GPT-3.5, giving me access to even more advanced reasoning capabilities. And just by tweaking parts of that big extension.js file, you can even make the extension print its auth token in VS Code's output panel! This token looks something like gho-gsZCklbUs...
But wait, there’s more! I also tried to expand the max token limit from 8,000 to a whopping 32,000 tokens, which would allow Copilot to analyze and process much more code in a single file. This noticeably improved the accuracy of the code it generated.
Using the Extracted Token to Send Requests to the Model 🔑
After extracting the token, the next step was to use it to send requests to the GPT-4 model on GitHub’s servers. Here’s how I did it.
Fetching the Token
First, I needed a way to retrieve the token stored in localStorage. I wrote a Python function to get this token:
import requests

def get_token():
    try:
        url = "https://api.github.com/copilot_internal/v2/token"
        headers = {
            "Authorization": "token gho_asew",
            "Editor-Version": "vscode/1.83.0",
            "Editor-Plugin-Version": "copilot-chat/0.8.0"
        }
        response = requests.get(url, headers=headers)
        if response.status_code == 200:
            json = response.json()
            if 'token' in json:
                return json['token']
        return {"error": f"Received {response.status_code} HTTP status code"}
    except Exception as e:
        return {"error": str(e)}

token = get_token()
print(token)
This function makes a request to the GitHub API endpoint to fetch the token using the headers we modified earlier.
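One practical wrinkle: the token returned by this endpoint expires after a while, so it helps to cache it and refetch only when it goes stale. Here's a minimal sketch of that idea (the 600-second lifetime and the class name are my own assumptions, not documented values):

```python
import time

class TokenCache:
    """Caches a token and refreshes it via `fetch` once it is older than max_age seconds."""

    def __init__(self, fetch, max_age=600):
        self.fetch = fetch        # callable returning a fresh token string
        self.max_age = max_age    # assumed lifetime; adjust to whatever expiry the API reports
        self.value = None
        self.timestamp = 0.0

    def get(self):
        # Refetch only when we have no token or the cached one is too old.
        if self.value is None or time.time() - self.timestamp > self.max_age:
            self.value = self.fetch()
            self.timestamp = time.time()
        return self.value
```

You could wrap `get_token` from above in a `TokenCache` so repeated chat requests don't hammer the token endpoint.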
Sending Requests to the Model
With the token in hand, I could now send requests directly to the GPT-4 model behind the GitHub API. Here’s the code for reference:
import httpx
import asyncio

async def openai_agent_test(messages, model="gpt-4", temperature=0.5, stream=True):
    token = get_token()  # Assume the token is fetched by the get_token() function
    async with httpx.AsyncClient() as client:
        r = await client.post(
            "https://api.githubcopilot.com/chat/completions",
            headers={
                "Editor-Version": "vscode/1.83.0",
                "Authorization": f"Bearer {token}",
            },
            json={
                "messages": messages,
                "model": model,
                "temperature": temperature,
            },
            timeout=130.0
        )
        if r.status_code != 200:
            return f"Request failed with HTTP status {r.status_code}"
        return r.json()["choices"][0]["message"]["content"]

messages = [{"role": "user", "content": "Hello, how are you?"}]
assistant_response = asyncio.run(openai_agent_test(messages))
print(assistant_response)
This code sends a message to the GPT-4 model using the token and returns the model’s response. Pretty neat, right?
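Since the `messages` field is just a list of role/content dicts in the standard chat format, you can prepend a system prompt and keep appending turns as a conversation grows. A tiny helper for assembling that payload (the function name is mine, not part of any API):

```python
def build_messages(system_prompt, history, user_input):
    """Assemble a chat payload: optional system prompt, prior turns, then the new user message."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.extend(history)
    messages.append({"role": "user", "content": user_input})
    return messages

payload = build_messages(
    "You are a helpful assistant.",
    [{"role": "user", "content": "hi"}, {"role": "assistant", "content": "hello"}],
    "How are you?",
)
```

The resulting list drops straight into the `"messages"` key of the request body above.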
Note
I haven’t explained in detail how to get the exact gho-gsZCklbUs... token by modifying extension.js file because it’s still a bit of a gray area and involves some workarounds. This walkthrough is for educational purposes only and to give you an idea of how things work behind the scenes.
Building the Frontend with Streamlit
Alright, now let’s dive into how I built the frontend using Streamlit. This part is super fun because it lets you create a user interface that interacts with the AI models. I’ll walk you through the steps and show you some code snippets to help you understand how everything works.
Setting Up Streamlit
First things first, you need to set up Streamlit. If you haven’t installed it yet, you can do so by running:
pip install streamlit
Once you have Streamlit installed, create a new Python file; let’s call it copilot.py, the same name I’ve used in the GitHub repo. This file will contain all the code for your Streamlit app, and you can launch it at any point with streamlit run copilot.py.
Creating the Sidebar
import streamlit as st

# Setting page title and header
st.set_page_config(page_title="Vulnyzer", page_icon=":robot_face:")

# Sidebar for model selection
default_models = ["gpt-4", "gpt-3.5", "Custom"]
selected_model = st.sidebar.selectbox("Select Model", default_models)

# If 'Custom' is selected, show an additional input field for custom model name
if selected_model == "Custom":
    model = st.sidebar.text_input("Input Custom Model")
else:
    model = selected_model

# Sidebar for role selection
role = st.sidebar.selectbox("Select Role", ["sid", "Tina(DSA)", "Sia (Software Dev)", "Research Mode", "CodeRed (Jailbreak)", "Custom"])

# Sidebar for temperature control
temperature = st.sidebar.slider("Set Temperature", min_value=0.0, max_value=2.0, value=0.6, step=0.1)
This code sets up a basic sidebar where users can select the model, role, and temperature for the AI interactions.
Handling File Uploads
import tempfile
import pdfplumber
import re

# Make sure the chat history exists before we append to it
if "messages" not in st.session_state:
    st.session_state.messages = []

# Add PDF upload widget in the sidebar
uploaded_file = st.sidebar.file_uploader("Upload a PDF", type=['pdf'])

# If a PDF is uploaded
if uploaded_file is not None:
    # Create a temporary file and write the uploaded file's content to it
    with tempfile.NamedTemporaryFile(delete=False) as temp:
        temp.write(uploaded_file.getvalue())

    # Load the PDF file
    pdf = pdfplumber.open(temp.name)

    # Initialize an empty string to hold the extracted text
    extracted_text = ""

    # Loop through each page in the PDF and extract the text
    # (extract_text() can return None for image-only pages)
    for page in pdf.pages:
        extracted_text += page.extract_text() or ""

    # Close the PDF file
    pdf.close()

    # Remove non-alphanumeric characters except for scientific symbols and punctuation
    extracted_text = re.sub(r'[^A-Za-z0-9 .,?!;:+/*^()@#$%&{}\[\]<>|\\~`-]', ' ', extracted_text)

    # Replace multiple spaces with a single space
    extracted_text = re.sub(r'\s+', ' ', extracted_text).strip()
    extracted_text = extracted_text + ". the text above is extracted from a pdf file "

    # Format the extracted text as a code snippet
    extracted_text = f'```\n{extracted_text}\n```'

    # Send the extracted text to the chatbot as user input
    if not any(message["content"] == extracted_text for message in st.session_state.messages):
        st.session_state.messages.append({"role": "user", "content": extracted_text})
This code uploads a PDF file, extracts the text, and formats it as a code snippet that can be sent to the AI model.
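To see what the two regex passes actually do, here is the same cleanup applied to a small sample string of my own (non-ASCII letters and odd whitespace get replaced with spaces, then runs of whitespace collapse to one):

```python
import re

def clean_text(raw):
    # Pass 1: replace characters outside the allowed set with spaces.
    text = re.sub(r'[^A-Za-z0-9 .,?!;:+/*^()@#$%&{}\[\]<>|\\~`-]', ' ', raw)
    # Pass 2: collapse runs of whitespace and trim the ends.
    return re.sub(r'\s+', ' ', text).strip()

sample = "Hello\u00a0world!\nCafé résumé   (page 1)"
print(clean_text(sample))
```

Note the trade-off: accented characters like é are dropped along with true junk, which is acceptable for feeding text to a chat model but lossy for multilingual PDFs.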
Handling YouTube Transcripts
from youtube_transcript_api import YouTubeTranscriptApi
import urllib.parse as urlparse

# Add YouTube link input field in the sidebar
youtube_link = st.sidebar.text_input("Paste a YouTube link")

def extract_video_id(url):
    parsed_url = urlparse.urlparse(url)
    if parsed_url.hostname == 'youtu.be':
        return parsed_url.path[1:]
    if parsed_url.hostname in ('www.youtube.com', 'youtube.com'):
        if parsed_url.path == '/watch':
            query = urlparse.parse_qs(parsed_url.query)
            return query['v'][0]
        if parsed_url.path[:7] == '/embed/':
            return parsed_url.path.split('/')[2]
        if parsed_url.path[:3] == '/v/':
            return parsed_url.path.split('/')[2]
    raise ValueError("Invalid YouTube URL")

# If a YouTube link is provided
if youtube_link.strip():
    video_id = extract_video_id(youtube_link)
    transcript = YouTubeTranscriptApi.get_transcript(video_id)
    transcript_text = ' '.join([x['text'] for x in transcript])
    transcript_text = transcript_text + ". the text above is transcript of a youtube video "
    transcript_text = f'```\n{transcript_text}\n```'
    if not any(message["content"] == transcript_text for message in st.session_state.messages):
        st.session_state.messages.append({"role": "user", "content": transcript_text})
This code extracts the video ID from a YouTube URL, fetches the transcript, and formats it as a code snippet for the AI model.
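The parser handles the common URL shapes (short links, watch pages, embeds). Here's a condensed, self-contained restatement of the same parsing logic so you can sanity-check it offline; the URLs are just examples:

```python
import urllib.parse as urlparse

def extract_video_id(url):
    """Pull the 11-character video ID out of the usual YouTube URL formats."""
    parsed = urlparse.urlparse(url)
    if parsed.hostname == 'youtu.be':
        return parsed.path[1:]                          # https://youtu.be/<id>
    if parsed.hostname in ('www.youtube.com', 'youtube.com'):
        if parsed.path == '/watch':
            return urlparse.parse_qs(parsed.query)['v'][0]  # ?v=<id>
        if parsed.path.startswith(('/embed/', '/v/')):
            return parsed.path.split('/')[2]            # /embed/<id> or /v/<id>
    raise ValueError("Invalid YouTube URL")

for url in ("https://youtu.be/dQw4w9WgXcQ",
            "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
            "https://youtube.com/embed/dQw4w9WgXcQ"):
    print(extract_video_id(url))
```

All three prints yield the same ID, which is what the transcript API expects.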
Interacting with the AI Model
Finally, to interact with the AI model, the app reuses the same openai_agent_test function shown earlier: it sends the accumulated chat messages, along with the selected model and temperature, to the Copilot chat endpoint using the extracted token and returns the model’s response. Wire that call into your Streamlit app and you have a dynamic chat interface.
Building an Intermediary Server
To make the interaction with the Copilot model more flexible, I set up an intermediary server. This server acts as a bridge between the Copilot server and our custom-made chatbot UI. It helps in tracking usage limits via API keys and can be used in other language model-powered applications.
The Intermediary Server
Here’s the code for our intermediary server, which was built using Node.js and Axios. Below are the important parts of the code, with explanations.
Making a Request
The makeRequest function sends a GET request to our intermediary server with an API key in the headers.
const axios = require('axios');

async function makeRequest() {
    try {
        const response = await axios.get('https://api-codelite.netlify.app/.netlify/functions/completion', {
            headers: {
                'api-key': 'godlikemode'
            }
        });
        console.log(response.data);
    } catch (error) {
        console.error(`Error: ${error}`);
    }
}

makeRequest();
Getting the Token
The getToken function fetches a new token from GitHub Copilot’s internal API using the specified headers.
let tokenInfo = {
    value: null,
    timestamp: Date.now(),
};

async function getToken() {
    try {
        const url = "https://api.github.com/copilot_internal/v2/token";
        const headers = {
            "Authorization": "token gho_asew",
            "Editor-Version": "vscode/1.83.0",
            "Editor-Plugin-Version": "copilot-chat/0.8.0"
        };
        const response = await axios.get(url, { headers });
        if (response.status === 200) {
            tokenInfo = {
                value: response.data.token,
                timestamp: Date.now(),
            };
            return tokenInfo.value;
        }
    } catch (error) {
        console.error(error);
        return { error: error.message };
    }
}
API Keys and Limits
We define API keys with usage limits so that each key has a restricted number of uses.
let apiKeys = {
    'godlikemode': { count: 0, limit: 100000, messages: [] },
    'sp-ea960874-e227-473b-b5b3-37b02023823b': { count: 0, limit: 1000, messages: [] },
    // ...other keys
};
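The same bookkeeping translates directly into a few lines of any language: each key maps to a counter and a limit, and a request either bumps the counter or gets rejected with an appropriate status. A Python sketch of the idea (the key names, limits, and function name are placeholders of mine):

```python
api_keys = {
    "demo-key": {"count": 0, "limit": 3, "messages": []},
}

def check_and_count(key):
    """Return (allowed, status): 403 for unknown keys, 429 once the limit is hit, else 200."""
    info = api_keys.get(key)
    if info is None:
        return False, 403          # unknown key
    if info["count"] >= info["limit"]:
        return False, 429          # quota exhausted
    info["count"] += 1             # charge one use against the key
    return True, 200
```

The handler further down applies exactly this check before forwarding anything to the model.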
Interacting with the AI Model
The openaiAgentTest function interacts with the Copilot model using the token and sends the request to the AI model endpoint.
async function openaiAgentTest(messages, model = "gpt-4", temperature = 0.7) {
    // Refresh the token if it's older than 10 minutes
    if (!tokenInfo.value || Date.now() - tokenInfo.timestamp > 600 * 1000) {
        const newToken = await getToken();
        if (!newToken || newToken.error) {
            console.error(`Error refreshing token: ${newToken && newToken.error}`);
            return;
        }
    }
    try {
        const response = await axios({
            method: 'post',
            url: "https://api.githubcopilot.com/chat/completions",
            headers: {
                "Editor-Version": "vscode/1.83.0",
                "Authorization": `Bearer ${tokenInfo.value}`,
                "api-key": process.env.OPENAI_API_KEY,
            },
            data: {
                messages,
                model,
                temperature,
                role: "system", // You can replace "system" with the role you want
            },
            timeout: 130000
        });
        if (response.status !== 200) {
            throw new Error(`Request failed with HTTP status ${response.status}`);
        }
        return response.data.choices[0].message.content;
    } catch (error) {
        console.error(error);
        return { error: error.message };
    }
}
Handling Requests
The exports.handler function handles incoming requests, validates API keys, tracks usage, and forwards the messages to the AI model.
exports.handler = async function(event, context) {
    const data = JSON.parse(event.body);
    const apiKey = event.headers['api-key'];

    if (!apiKey || !apiKeys[apiKey]) {
        return { statusCode: 403, body: 'Invalid API Key.' };
    }
    if (apiKeys[apiKey].count >= apiKeys[apiKey].limit) {
        return { statusCode: 429, body: 'API Key usage limit exceeded.' };
    }

    try {
        const { messages, model, temperature } = data;
        apiKeys[apiKey].messages.push(...messages);
        const result = await openaiAgentTest(apiKeys[apiKey].messages, model, temperature);
        apiKeys[apiKey].count++;
        if (result.error) {
            return { statusCode: 500, body: result.error };
        }
        return { statusCode: 200, body: JSON.stringify(result) };
    } catch (error) {
        console.error(error);
        return { statusCode: 500, body: 'An error occurred while processing your request.' };
    }
};
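Because the handler reads event.body, a client that carries a chat payload should POST JSON rather than issue a bare GET. Here's a stdlib-only Python sketch that builds such a request against the function endpoint (the helper name is mine, and actually sending it is left commented out so the example stays offline):

```python
import json
import urllib.request

def build_request(endpoint, api_key, messages, model="gpt-4", temperature=0.7):
    """Build the POST the handler expects: a JSON body plus the api-key header."""
    body = json.dumps({
        "messages": messages,
        "model": model,
        "temperature": temperature,
    }).encode()
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

req = build_request(
    "https://api-codelite.netlify.app/.netlify/functions/completion",
    "godlikemode",
    [{"role": "user", "content": "Hello!"}],
)
# urllib.request.urlopen(req) would actually send it.
```

One design note: urllib normalizes header names (so "api-key" is stored as "Api-key"), which most serverless platforms handle case-insensitively, but it's worth knowing when debugging 403s.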
Explanation in short :)
- Making a Request: sends a GET request to our intermediary server and logs the response.
- Getting the Token: fetches a new token from GitHub Copilot’s internal API.
- API Keys and Limits: manages API keys and usage limits to ensure controlled access.
- Interacting with the AI Model: talks to the Copilot model using the token and sends the request to the AI model endpoint.
- Handling Requests: handles incoming requests, validates API keys, tracks usage, and forwards the messages to the AI model.
This intermediary server setup allows us to effectively manage interactions with the Copilot model, ensuring secure and controlled access.
Conclusion
This project was an exciting journey into customizing and extending the functionality of GitHub Copilot. By diving into the inner workings of the Copilot extension, I was able to extract tokens and interact with the advanced GPT-4 model, significantly enhancing its capabilities. Building an intermediary server provided a flexible way to manage interactions and API keys, making it possible to use the Copilot model in various applications.
This project not only satisfied my curiosity but also showcased the potential of leveraging existing tools to create powerful and customized solutions. By messing around with the Copilot extension and exploring its possibilities, I stumbled upon some interesting aspects that led me to develop this entire project. This exploit was already reported to GitHub and is mostly fixed, with only a few issues remaining.
Remember, this exploration was purely for educational purposes, and I don’t encourage anyone to bypass any terms of service or engage in activities that may be considered unauthorized or unethical. I’m sharing this with you all to highlight the importance of curiosity and learning in tech.
Happy Exploring!