Last September, we launched the Early Access Program of GitHub Integrations, a new option for extending GitHub. We’ve recently added some new features and moved Integrations into pre-release so you can begin using it within your production workflows. Here’s a summary of the latest features. You can learn more about what’s changed from our Developer Blog.
Users can now log in with your Integration using the OAuth protocol, allowing you to identify users and display data to them from the relevant installations. Additionally, an Integration can now make authorized API requests on behalf of a user; for example, to deploy code or create an issue. Learn more about authenticating as a user via an Integration.
When you create an Integration, you have to specify which permissions it needs; for example, the ability to read issues or create deployments. Now you can update the requested permissions via Settings > Developer settings > Integrations, whenever the needs of your Integration change. Users will be prompted to accept these changes and enable the new permissions on their installation.
Finally, you now have the option to configure a Setup URL to which you can redirect users after they install your integration if any additional setup is required on your end.
Today we’re announcing the next major release of Git LFS: v2.1.0, including new features, performance improvements, and more.
With Git LFS 2.1.0, get a more comprehensive look at which files are marked as modified by running the
git lfs status command.
Git LFS will now tell you if your large files are being tracked by LFS, stored by Git, or a combination of both. For instance, if LFS sees a file that is stored as a large object in Git, it will convert it to an LFS pointer on checkout which will mark the file as modified. To diagnose this, try
git lfs status for a look at what’s going on:
On branch master Git LFS objects to be committed: converted-lfs-file.dat (Git: acfe112 -> LFS: acfe112) new-lfs-file.dat (LFS: eeaf82b) partially-staged-lfs-file.dat (LFS: a12942e) Git LFS objects not staged for commit: unstaged-lfs-file.dat (LFS: 1307990 -> File: 8735179) partially-staged-lfs-file.dat (File: 0dc8537)
Git LFS 2.1.0 introduces support for URL-style configuration via your
.lfsconfig. For settings that apply to URLs, like
lfs.locksverify, you can now scope them to a top-level domain, a root path, or just about anything else.
To better understand and debug network requests made by Git LFS, version 2.1.0 introduces a detailed view via the
GIT_LOG_STATS=1 environment variable:
$ GIT_LOG_STATS=1 git lfs pull Git LFS: (201 of 201 files) 1.17 GB / 1.17 GB $ cat .git/lfs/objects/logs/http/*.log concurrent=15 time=1493330448 version=git-lfs/2.1.0 (GitHub; darwin amd64; go 1.8.1; git f9d0c49e) key=lfs.batch event=request url=https://github.com/user/repo.git/info/lfs/objects/batch method=POST body=8468 key=lfs.batch event=request url=https://github.com/user/repo.git/info/lfs/objects/batch method=POST status=200 body=59345 conntime=47448309 dnstime=2222176 tlstime=163385183 time=491626933 key=lfs.data.download event=request url=https://github-cloud.s3.amazonaws.com/... method=GET status=200 body=4 conntime=58735013 dnstime=6486563 tlstime=120484526 time=258568133 key=lfs.data.download event=request url=https://github-cloud.s3.amazonaws.com/... method=GET status=200 body=4 conntime=58231460 dnstime=6417657 tlstime=122486379 time=261003305 # ...
The Git LFS API has long supported an
expires_at property in both SSH authenticate as well as Batch API responses. This introduced a number of issues where an out-of-sync system clock would cause LFS to think that objects were expired when they were still valid. Git LFS 2.1.0 now supports an
expires_in property to specify a duration relative to your computer’s time to expire the object.
The LFS team is working on a migration tool to easily migrate your existing Git repositories with large objects into LFS without the need to write a
git filter-branch command. We’re also still inviting your feedback on our File Locking feature.
That was a quick overview of some of the larger changes included in this release. To get a more detailed look, check out the release notes.
A few weeks ago, researchers announced SHAttered, the first collision of the SHA-1 hash function. Starting today, all SHA-1 computations on GitHub.com will detect and reject any Git content that shows evidence of being part of a collision attack. This ensures that GitHub cannot be used as a platform for performing collision attacks against our users.
This fix will also be included in the next patch releases for the supported versions of GitHub Enterprise.
Git stores all data in “objects.” Each object is named after the SHA-1 hash of its contents, and objects refer to each other by their SHA-1 hashes. If two distinct objects have the same hash, this is known as a collision. Git can only store one half of the colliding pair, and when following a link from one object to the colliding hash name, it can’t know which object the name was meant to point to.
Two objects colliding accidentally is exceedingly unlikely. If you had five million programmers each generating one commit per second, your chances of generating a single accidental collision before the Sun turns into a red giant and engulfs the Earth is about 50%.
If a Git fetch or push tries to send a colliding object to a repository that already contains the other half of the collision, the receiver can compare the bytes of each object, notice the problem, and reject the new object. Git has implemented this detection since its inception.
However, SHA-1 names can be assigned trust through various mechanisms. For instance, Git allows you to cryptographically sign a commit or tag. Doing so signs only the commit or tag object itself, which in turn points to other objects containing the actual file data by using their SHA-1 names. A collision in those objects could produce a signature which appears valid, but which points to different data than the signer intended. In such an attack the signer only sees one half of the collision, and the victim sees the other half.
The recent attack cannot generate a collision against an existing object. It can only generate a colliding pair from scratch, where the two halves of the pair are similar but contain a small section of carefully-selected random data that differs.
An attack therefore would look something like this:
Generate a colliding pair, where one half looks innocent and the other does something malicious. This is best done with binary files where humans are unlikely to notice the difference between the two halves (the recent attack used PDFs for this purpose).
Convince a project to accept your innocent half, and wait for them to sign a tag or commit that contains it.
Distribute a copy of the repository with the malicious half (either by breaking into a hosting server and replacing the innocent object on disk, or hosting it elsewhere and asking people to verify its integrity based on the signatures). Anybody verifying the signature will think the contents match what the project owners signed.
Generating a collision via brute-force is computationally too expensive, and will remain so for the foreseeable future. The recent attack uses special techniques to exploit weaknesses in the SHA-1 algorithm that find a collision in much less time. These techniques leave a pattern in the bytes which can be detected when computing the SHA-1 of either half of a colliding pair.
GitHub.com now performs this detection for each SHA-1 it computes, and aborts the operation if there is evidence that the object is half of a colliding pair. That prevents attackers from using GitHub to convince a project to accept the “innocent” half of their collision, as well as preventing them from hosting the malicious half.
The actual detection code is open-source and was written by Marc Stevens (whose work is the basis of the SHAttered attack) and Dan Shumow. We are grateful for their work on that project.
Not yet. Git’s object names take into account not only the raw bytes of the files, but also some Git-specific header information. The PDFs provided by the SHAttered researchers collide in their raw bytes, but not when added to a Git repository. The same technique could be used to generate a Git object collision, but like the generation of the original SHAttered PDFs, it would require spending hundreds of thousands of dollars in computation.
Blocking collisions that pass through GitHub is only the first step. We’ve already been working with the Git project to include the collision detection library upstream. Future versions of Git will be able to detect and reject colliding halves no matter how they reach the developer: fetching from other hosting sites, applying patches, or generating objects from local data.
The Git project is also developing a plan to transition away from SHA-1 to another, more secure hash algorithm, while minimizing the disruption to existing repository data. As that work matures, we plan to support it on GitHub.
In honor of our Bug Bounty Program’s third birthday, we kicked off a promotional bounty period in January and February. In addition to bonus payouts, the scope of the bug bounty was expanded to include GitHub Enterprise. It may come as no surprise that including a new scope meant that the most severe bugs were all related to the newly included target.
There was no shortage of high-quality reports. Picking winners is always tough, but below are the intrepid researchers receiving extra bounties.
The first prize bonus of $12,000 goes to @jkakavas for their GitHub Enterprise SAML authentication bypass report which allowed an attacker to construct a SAML response that could arbitrarily set the authenticated user account. You can read more about the story on their blog.
The second prize bonus of $8,000 goes to @iblue for their remote code execution bug found in the GitHub Enterprise management console. This was due to a static secret mistakenly being used to cryptographically sign the session cookie. The static secret was intended to only be used for testing and development. However, an unrelated change of file permissions prevented the intended (and randomly generated) session secret from being used. By knowing this secret, an attacker could forge a cookie that is deserialized by
Marshal.load, leading to remote code execution.
The third prize bonus of $5,000 goes to @soby for their report of another GitHub Enterprise SAML authentication bypass. This attack showed it was possible to replay a SAML response to have our SAML implementation use unsigned data in determining which user account was authenticated.
The best report bonus of $5,000 goes to @orangetw for their report of Server-Side Request Forgery. This report involved chaining together four vulnerabilities to deliver requests to internal services that end up executing attacker-controlled code. The reporter supplied a clear explanation of the problem through each step and included proof-of-concept scripts for the entire journey. While this report did not earn a prize for being the most severe, it is exactly the type of report we want to encourage reporters to submit.
A theme we’ve seen continue over the years is paying out for bugs that aren’t in our own code but in browsers. 2016 was no exception with @filedescriptor’s report of funky Internet Explorer behavior detailing how a triple-encoded host value in a URL is handled in redirects.
Lastly, Unicode gonna Unicode. @jagracy found a way to exploit the way Unicode was normalized in our code and in our database engine to deliver password reset emails to entirely different addresses than what were intended.
Our “Two Years of Bounties” post has detailed stats for our submissions during the first two years of the program. These years saw a total payout of $95,300 across 102 submissions out of a total of 7,050 submissions (1.4% validity rate). So far in the third year of the program, we have paid out for 73 submissions for a total of $81,700. Many of these reports fall into our “$200 thank you” bucket. These are issues that we do not consider severe enough for an immediate fix, but that we still want to reward our researchers for. Forty-eight of these issues were deemed high enough risk to warrant a write-up on https://bounty.github.com. We saw a total of 48 out of 795 valid reports, bringing our validity rate to 6%.
In 2016, we saw a slight decrease in the number of reports compared to the average of the previous two years. While we can’t know for certain, we suspect this is due to the ever-decreasing presence of “low hanging fruit.” Unfortunately, we saw an increase in the number of “critical” reports. All other categories saw a decrease in the number of reports.
Out of the accepted reports, we saw some notable changes in the type of vulnerability reports submitted. Many categories such as XSS, Injection, and CSRF saw a decrease in reports. Notably, the number of valid CSRF reports for 2016 was zero. At the same time, we saw an increase in session handling bugs, sensitive data exposure, and missing function level access controls.
In April of 2016, we transitioned to the HackerOne platform. The graphs will not include all data for the year, unfortunately, but the data is still interesting. Notably, even though we had run a public bounty for almost 2.4 years, we still experienced a large spike upon announcing the transition, just like many programs’ initial public launch.
We’ve also been able to track new data via the platform. For example, our average response was about 16 hours and our time to resolution was about 28 days. Tracking this data and keeping it within acceptable bounds will help ensure our program continues to run smoothly and efficiently.
We continue to see rewards being donated to charities. We absolutely love donating bounties, and we match all contributions. This year saw donations to Doctors without Borders and the Electronic Frontier Foundation (EFF).
We also sponsored an event that was aimed at helping people from under-represented backgrounds participate in bug bounties. The h1-415 event saw attendance from groups like Hack the Hood, Women in Security and Privacy (WISP), FemHacks, Lairon College Cyber Patriots, and /dev/color. Hack the Hood was nice enough to make a video from the event.
In 2016, we’ve learned a lot about running a bug bounty program and we’ve continued to uncover sharp edges of our services and codebase. Our program will continue to evolve to engage and support the community of bounty hunters that have made this program successful. We look forward to your submission in our fourth year of the program!
Best regards and happy hacking,
Starting today, all Markdown user content hosted in our website, including user comments, wikis, and
.md files in repositories will be parsed and rendered following a formal specification for GitHub Flavored Markdown. We hope that making this spec available will allow third parties to better integrate and preview GFM in their software.
The full details of the specification are available in our Engineering Blog.
This project is based on CommonMark, a joint effort to specify and unify the different Markdown dialects that are currently available. We’ve updated the original CommonMark spec with formal definitions for the custom Markdown features that are commonly used in GitHub, such as tables, task lists, and autolinking.
Together with the specification, we’re also open-sourcing a reference implementation in C, based on the original
cmark parser, but with support for all the features that are actively used in GitHub. This is the same implementation we use in our backend.
A lot of care and research has been put into designing the CommonMark spec to make sure it specifies the syntax and semantics of Markdown in a way that represents the existing real-world usage.
Because of this, we expect that the vast majority of Markdown documents on GitHub will not be affected at all by this change. Some documents sporting the most arcane features of Markdown may render differently. Check out our extensive GitHub Engineering blog post for more details on what has changed.