Mining passwords from public GitHub repositories

I was on GitHub today and had a thought: how easy would it be to mine database, account and server passwords from public repositories where the developer forgot to remove the password from the source code before pushing?

I ran a simple test using GitHub’s search with certain keywords, e.g.:

You only need to go through about 10 pages of search results (“root password” has over 10,000 results) to see a few passwords that look real. GitHub does have an article about removing sensitive data (http://help.github.com/removing-sensitive-data/), which includes a good statement: “Once the commit has been pushed you should consider the data to be compromised. Period.” That is very true, but it seems there are a lot of developers out there who are committing their passwords. I wonder how many hackers have prowled through GitHub looking for passwords and, as a result, successfully pulled off an attack.

However, the best search term is “gmail password” (http://github.com/search?type=Code&language=&q=gmail+password&repo=&langOverride=&x=0&y=0&start_value=1); as you can see, the first result looks like a real Gmail password. I haven’t tested any of these passwords, but I’m sure there are plenty of real passwords that developers have committed.

So remember, DON’T COMMIT YOUR PASSWORDS!


7 Responses to “Mining passwords from public GitHub repositories”

  1. opt9 says:

    Why not “twitter password” ?
    page 4 is very interesting.

  2. [...] and moving your own projects to GitHub if you’re still self-hosting. Just, uh, don’t commit passwords. [...]

  3. Matt says:

    What is the best way to programmatically prevent your passwords from being committed? Do you need to add all of your files to your index and then remove anything with a password in it? Thanks! Matt, Leanfounder.com
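
One hedged way to approach this question is a Git pre-commit hook that inspects the staged changes before they become a commit. The sketch below is only an illustration: the regular expression and the function names are assumptions made for this example, and a simple pattern like this will miss many secrets and flag some harmless lines.

```python
import re
import subprocess
import sys

# Assumed heuristic: a name containing password/passwd/pwd that is
# assigned a quoted literal, e.g. db_password = "hunter2".
SECRET_PATTERN = re.compile(
    r"""(?i)\b\w*(?:password|passwd|pwd)\w*\s*[:=]\s*['"][^'"]+['"]"""
)


def suspect_added_lines(diff_text):
    """Return added lines in a unified diff that look like a hard-coded password."""
    hits = []
    for line in diff_text.splitlines():
        # Added lines start with "+"; "+++" marks a file header, not content.
        if line.startswith("+") and not line.startswith("+++"):
            if SECRET_PATTERN.search(line[1:]):
                hits.append(line[1:].strip())
    return hits


def main():
    # "git diff --cached" shows what is staged for the next commit.
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    hits = suspect_added_lines(diff)
    for hit in hits:
        print(f"possible hard-coded password: {hit}", file=sys.stderr)
    # A non-zero return value is meant to be used as the hook's exit status.
    return 1 if hits else 0

# In the hook file itself, finish with: sys.exit(main())
```

Saved as an executable .git/hooks/pre-commit file that ends with sys.exit(main()), a script like this makes Git abort the commit whenever it exits with a non-zero status. It only guards the machine it is installed on, so it complements an ignore file rather than replacing one.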

  4. Gonzalo says:

    Using GitHub or similar services is great. We can easily back up our code remotely. But as you show in your post, it is really easy to forget to exclude our configuration files from commits.
    I like to store all passwords in one file, exclude that file from the version control system (I use Mercurial, which has a .hgignore file) and create a dummy file with dummy passwords and a .dist extension.
    But it’s really easy to forget, and if you commit once your passwords are exposed forever (even if you commit again).
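
A minimal sketch of the single-password-file idea, assuming an INI-style file. The file name secrets.ini, its secrets.ini.dist template and the load_secrets helper are hypothetical names chosen for this example, not something the comment specifies.

```python
import configparser
from pathlib import Path


def load_secrets(path="secrets.ini"):
    """Read passwords from the ignored file; point at the .dist template if it is missing."""
    real = Path(path)
    template = Path(str(path) + ".dist")
    if not real.exists():
        hint = ""
        if template.exists():
            hint = f"; copy {template} to {real} and fill in real values"
        raise FileNotFoundError(f"{real} not found{hint}")
    parser = configparser.ConfigParser()
    parser.read(real, encoding="utf-8")
    # Flatten to plain dictionaries: {section: {key: value}}.
    return {section: dict(parser[section]) for section in parser.sections()}
```

The application then reads its passwords through load_secrets() only, so the real file can stay in the ignore list while the committed .dist copy documents the expected format.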

  5. Ben Maynard says:

    Gonzalo is correct. In my global .gitignore file I have added:

    config.php
    configuration.py

    Those are the two file names I deal with that hold configuration settings, so I will never commit those files! It is also a good idea, when you set up a new project and import all the resources, to look at which files are being committed, just in case :)

    I also add config.dist.php and configuration.dist.py to source control, and whenever I add a new setting to one of the real config files, I add it to the matching .dist file too.
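
Keeping the .dist file in step with the real one is easy to forget, so here is a rough sketch of a check for the Python pair. It assumes both configuration.py and configuration.dist.py hold plain top-level assignments; the helper names are made up for this example.

```python
import ast
from pathlib import Path


def setting_names(source):
    """Collect the names assigned at the top level of a Python config file."""
    names = set()
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
    return names


def missing_from_template(config_source, template_source):
    """Return settings present in the real config but absent from the .dist template."""
    return sorted(setting_names(config_source) - setting_names(template_source))


if __name__ == "__main__":
    config = Path("configuration.py")
    template = Path("configuration.dist.py")
    # Only report when both files are present in the current directory.
    if config.exists() and template.exists():
        missing = missing_from_template(config.read_text(), template.read_text())
        for name in missing:
            print(f"{name} is missing from {template}")
```

The script parses the files without importing them, so the real passwords are never executed or printed; only the setting names are compared.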

  6. Matt Dyor says:

    Thanks for the response. Storing all passwords in one file makes sense for a couple of reasons (and including the .dist corollary so that people/you will know what the format looks like…without the passwords:). Thanks for the pointer! Matt

  7. radio control warbird

    Mining passwords from public GitHub repositories « Ben’s blog about anything
