Google Will Pay You to Find Weaknesses in Its AI

Not everything will be eligible for a payout, though.

Generative AI is cool, but it can also be dangerous if used improperly. That's why AI models are trained to reject certain, more dangerous kinds of requests. Except that if you get a little clever, you might be able to convince the AI to disregard its guidelines and comply with questionable requests using more creative prompts. Now, Google wants to teach its AI some manners. It's offering to pay people who convince Bard to do something bad.

Google's vulnerability rewards program, which rewards users who are able to find vulnerabilities and weaknesses in the code within its software (both apps and operating systems), is expanding to include Bard and questionable prompts. If you happen to be able to twist around a prompt enough to get Bard to do something bad that it's not supposed to be able to do (known as a prompt injection attack), Google might pay you a sum of money. The VRP also covers other kinds of attacks that can be performed on Bard, such as training data extraction, where you successfully get an AI to give you sensitive data, such as personally identifiable information and passwords.

Google already has a different (non-paying) reporting channel for factually incorrect/weird responses and the like. The company will only pay for things that can be exploited by a hacker for malicious purposes. So, if you manage to convince the AI to say slurs, give you Windows keys, or say that it will kill you, it's probably not within Google's bounty program. Google also says that it won't pay for issues related to copyright issues or non-sensitive data extraction, but other than this, you might be able to get thousands of dollars from a report depending on how bad it actually is.

By treating these kinds of issues as vulnerabilities and including them in its bounty program, Google hopes to be able to greatly strengthen its AI and make it adhere to its code of ethics and guidelines as well as possible. We also expect Google to pay a lot of money to users from this. Finding weaknesses within an AI model by throwing prompts at it and seeing if they stick is way different from reading through code, identifying an opening, and seeing how to get through it.

If this is something you're interested in, make sure to check out Google's guidelines for reporting issues on AI products, so you can know what's in scope and what's not.

Source: Google via TechCrunch

Source

Adenman
1

User Feedback

0 Comments

Recommended Comments

There are no comments to display.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Add a comment...

× Pasted as rich text. Paste as plain text instead

Only 75 emoji are allowed.

× Your link has been automatically embedded. Display as a link instead

× Your previous content has been restored. Clear editor

× You cannot paste images directly. Upload or insert images from URL.

Insert image from URL

Sign In

Google Will Pay You to Find Weaknesses in Its AI

User Feedback

Recommended Comments

Join the conversation

Recently Browsing 0 members

nsane.down

News

Browse

Activity