You Have Triggered an Abuse Detection Mechanism. Please Wait a Few Minutes Before You Try Again.

Matt Martz

Using GitHub's API to search for code references across multiple organizations

Background

As part of a modernization project we're trying to split up our ~900 table MySQL database into far fewer DynamoDB tables. In order to evaluate this, we need to clarify the impact on our code. Easy enough, right? Well... we have two GitHub organizations with a combined 800 repos to go through.

To help with this, I wrote a script to do code searches with GitHub's v3 API.


The v4 GraphQL API doesn't support code search as of this writing.

Pre-reqs

You'll need a personal access token with Read access defined in your environment variables under GITHUB_CREDENTIALS_PSW.

I have Node version v12.13.1 installed.

Example code is here:

There are only 3 dependencies installed into the project... Node types, typescript and axios.

A search.json file also needs to be created. The JSON object includes a list of code strings to search for and the GitHub organizations to search through.

In the example I just did a basic search for password across a couple of larger organizations... don't read into that... I just knew it'd be a common thing mentioned in at least one place.
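As a rough sketch of what search.json is expected to contain, here is its shape written as a TypeScript type. The keys codeStrings and organizations are the ones the script reads later on; the SearchConfig name, the isSearchConfig guard and the organization names are placeholders for illustration, not part of the original project.

```typescript
// Shape of search.json as the script consumes it
// (searchData.codeStrings and searchData.organizations).
// The interface name is hypothetical.
interface SearchConfig {
  codeStrings: string[];
  organizations: string[];
}

// Placeholder values: swap in your own search terms and organizations.
const exampleConfig: SearchConfig = {
  codeStrings: ["password"],
  organizations: ["example-org-one", "example-org-two"],
};

// Hypothetical guard to fail fast on a malformed file.
function isSearchConfig(value: unknown): value is SearchConfig {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const isStringArray = (x: unknown): boolean =>
    Array.isArray(x) && x.every((s) => typeof s === "string");
  return isStringArray(v.codeStrings) && isStringArray(v.organizations);
}

// The JSON text that would go into search.json.
const searchJsonText = JSON.stringify(exampleConfig, null, 2);
```

Running the guard on the parsed file before starting any searches is cheap insurance against spending API calls on a typo.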

In my actual use case the list of code strings I used were the table names. For 32 tables x 2 organizations this will be 64 API calls (which is over the authenticated user's rate limit of 30 / minute... more on that later).
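To make the arithmetic concrete, here is a small sketch of the numbers involved. The function names are made up for illustration, and pacing requests like this only addresses the documented per-minute limit, not the abuse detection described further down.

```typescript
// One search per (code string, organization) pair.
function totalCalls(codeStringCount: number, orgCount: number): number {
  return codeStringCount * orgCount;
}

// Number of one-minute rate-limit windows the calls are spread over
// when at most `limitPerMinute` calls fit in each window.
function rateLimitWindows(calls: number, limitPerMinute: number): number {
  return Math.ceil(calls / limitPerMinute);
}

// Spacing between serial calls (ms) that keeps them at or below the limit.
function pacingDelayMs(limitPerMinute: number): number {
  return Math.ceil(60000 / limitPerMinute);
}

const searchCalls = totalCalls(32, 2); // 64 calls
const windowsNeeded = rateLimitWindows(searchCalls, 30); // 3 windows
const delayBetweenCallsMs = pacingDelayMs(30); // 2000 ms
```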

The Script

The script is fairly straightforward. First I do the imports and initialize the reporting object.

    const fs = require("fs");
    const axios = require("axios");
    const searchData = require("./search.json");

    // personal access token stored in env
    const githubReadApiKey = process.env.GITHUB_CREDENTIALS_PSW;

    // report object that the rest of the script fills in
    const findings: any = {
      repos: {},
      code: {},
    };


Then I have some helper functions... this one just makes setTimeout a Promise so I can use it asynchronously later.

    async function sleep(ms: number) {
      return new Promise((resolve) => setTimeout(resolve, ms));
    }


This getRateLimit function isn't specifically used but is useful for testing. Authenticated API calls are allowed to make 30 searches per minute... BUT it turns out GitHub also does abuse detection.

    async function getRateLimit() {
      return axios.get("https://api.github.com/rate_limit", {
        headers: {
          Authorization: `token ${githubReadApiKey}`,
        },
      });
    }
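When testing, the interesting part of that response is the search bucket. The sketch below assumes the response body (res.data with axios) has a resources.search entry with limit, remaining and a reset timestamp in Unix seconds, which is how GitHub documents it; the type and function names are hypothetical, and it is worth checking the response you actually get in case the buckets are laid out differently.

```typescript
// One bucket of the /rate_limit response; `reset` is a Unix timestamp in seconds.
interface RateLimitBucket {
  limit: number;
  remaining: number;
  reset: number;
}

// Only the part of the body this sketch needs.
interface RateLimitBody {
  resources: { search: RateLimitBucket };
}

// Seconds to wait before the next search is allowed (0 if calls remain).
function secondsUntilSearchAllowed(body: RateLimitBody, nowMs: number): number {
  const bucket = body.resources.search;
  if (bucket.remaining > 0) return 0;
  return Math.max(0, Math.ceil(bucket.reset - nowMs / 1000));
}

// Made-up sample body: the bucket is exhausted and resets 60 seconds from "now".
const sampleBody: RateLimitBody = {
  resources: { search: { limit: 30, remaining: 0, reset: 1600000060 } },
};
const waitSeconds = secondsUntilSearchAllowed(sampleBody, 1600000000 * 1000);
```

Passing the current time in as a parameter keeps the function easy to test; in the script it would be called with Date.now().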


The searchCode function is the main API call that does the searching. I had to build in some multiple-try / wait code due to GitHub responding with "You have triggered an abuse detection mechanism. Please wait a few minutes before you try again." on occasion (even when under the API rate limit for searching). Fortunately their docs include a way around this: https://developer.github.com/v3/guides/best-practices-for-integrators/#dealing-with-abuse-rate-limits


The response includes a Retry-After header... the script reads it and waits for that time (typically a minute) + 1 second.
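Before the full function, here is that wait calculation sketched in isolation. It is not part of the script: retryDelayMs and the 60 second fallback are assumptions for illustration. The fallback matters because the script retries on any non-200 status, and a response such as a 422 validation error carries no Retry-After header, in which case parseInt returns NaN. Only the "number of seconds" form of the header is handled here.

```typescript
// Hypothetical helper: milliseconds to wait before retrying.
// Header names are lower-cased, which is how axios exposes them.
function retryDelayMs(
  headers: Record<string, string | undefined>,
  fallbackSeconds: number = 60 // assumed default, based on "typically a minute"
): number {
  const parsed = parseInt(headers["retry-after"] ?? "", 10);
  // Missing or malformed header: fall back instead of sleeping for NaN ms.
  const seconds = Number.isNaN(parsed) || parsed < 0 ? fallbackSeconds : parsed;
  // Same "+ 1 second" safety margin the script uses.
  return (seconds + 1) * 1000;
}

const delayWithHeader = retryDelayMs({ "retry-after": "60" }); // 61000
const delayWithoutHeader = retryDelayMs({}, 5); // 6000
```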

    // SearchResults is not defined in this post; it stands for the shape of
    // GitHub's code search response (items, total_count, ...).
    async function searchCode(
      codeStr: string,
      org?: string
    ): Promise<SearchResults | null> {
      const orgStr = org ? `+org:${org}` : "";
      const attempts = 2;
      for (let attempt = 0; attempt < attempts; attempt++) {
        try {
          const res = await axios
            .get(
              `https://api.github.com/search/code?q=${encodeURIComponent(
                codeStr
              )}${orgStr}&per_page=100`,
              {
                // never throw on HTTP status; non-200 responses are handled below
                validateStatus: function () {
                  return true;
                },
                headers: {
                  Authorization: `token ${githubReadApiKey}`,
                },
              }
            )
            .catch((e: Error) => {
              console.error(e);
            });
          if (res.status > 200) {
            console.log(res.data.message);
            const retryAfter = parseInt(res.headers["retry-after"]);
            console.log(
              `Sleeping for ${retryAfter + 1} seconds before trying again...`
            );
            await sleep((retryAfter + 1) * 1000);
          } else {
            return res.data;
          }
        } catch (e) {
          console.error(e);
        }
      }
      // only reached if every attempt failed
      return Promise.resolve(null);
    }
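The request URL is the part most worth checking by hand, so here is a sketch of it as a standalone function. buildSearchUrl is a hypothetical name; the endpoint, the org: qualifier and per_page=100 are taken from the script. Only the first page of up to 100 items is requested, which is why the script logs both items.length and total_count for each search.

```typescript
// Hypothetical helper that builds the same URL the script requests.
function buildSearchUrl(
  codeStr: string,
  org?: string,
  perPage: number = 100
): string {
  // The org qualifier is optional, mirroring the optional `org` argument.
  const orgQualifier = org ? `+org:${encodeURIComponent(org)}` : "";
  return (
    "https://api.github.com/search/code" +
    `?q=${encodeURIComponent(codeStr)}${orgQualifier}` +
    `&per_page=${perPage}`
  );
}

const urlWithOrg = buildSearchUrl("user table", "example-org");
const urlWithoutOrg = buildSearchUrl("password");
```

Encoding the search term keeps spaces and special characters from breaking the query string, while the literal + joins the term and the qualifier.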


The results are then split up into what I think are useful metrics...

    async function processResults(results: any, codeStr: string, org?: string) {
      console.log(`${codeStr}: ${results.items.length} - ${results.total_count}`);
      const items = results.items;
      // running total of matches for this search term
      findings.code[codeStr].count =
        findings.code[codeStr].count + results.total_count;
      items.forEach((item: any) => {
        // track which repos mention this search term
        if (
          findings.code[codeStr].repos.indexOf(item.repository.full_name) === -1
        ) {
          findings.code[codeStr].repos.push(item.repository.full_name);
          findings.code[codeStr].repoCount = findings.code[codeStr].repos.length;
        }
        // track which paths and search terms show up in each repo
        if (Object.keys(findings.repos).indexOf(item.repository.full_name) === -1) {
          findings.repos[item.repository.full_name] = {
            paths: [
              {
                path: item.path,
                score: item.score,
                url: item.html_url,
              },
            ],
            code: {},
            codeCount: 1,
          };
          findings.repos[item.repository.full_name].code[codeStr] = 1;
        } else {
          findings.repos[item.repository.full_name].paths.push({
            path: item.path,
            score: item.score,
            url: item.html_url,
          });
          if (
            Object.keys(findings.repos[item.repository.full_name].code).indexOf(
              codeStr
            ) === -1
          ) {
            findings.repos[item.repository.full_name].code[codeStr] = 1;
            findings.repos[item.repository.full_name].codeCount = Object.keys(
              findings.repos[item.repository.full_name].code
            ).length;
          } else {
            findings.repos[item.repository.full_name].code[codeStr] =
              findings.repos[item.repository.full_name].code[codeStr] + 1;
          }
        }
        findings.repos[item.repository.full_name].pathCount =
          findings.repos[item.repository.full_name].paths.length;
      });
    }


Finally... I use an async function to run all of these. I flatten the searches into one list, do the API calls in series with some post-processing and write it to an output.json file.

    async function main() {
      console.log("Starting Search...");
      const flattenedSearches: string[][] = [];
      searchData.codeStrings.forEach((codeStr: string) => {
        searchData.organizations.forEach((org: string) =>
          flattenedSearches.push([codeStr, org])
        );
      });
      for (let searchInd = 0; searchInd < flattenedSearches.length; searchInd++) {
        const search = flattenedSearches[searchInd];
        const searchResults = await searchCode(search[0], search[1]);
        // initialize once per code string so a second organization
        // adds to the counts instead of resetting them
        if (!findings.code[search[0]]) {
          findings.code[search[0]] = {
            count: 0,
            repos: [],
            repoCount: 0,
          };
        }
        // searchCode returns null when every attempt failed
        if (searchResults) {
          await processResults(searchResults, search[0], search[1]);
        }
      }
      findings.priority = {
        repos: Object.keys(findings.repos).sort(
          (a, b) => findings.repos[b].pathCount - findings.repos[a].pathCount
        ),
        code: Object.keys(findings.code).sort(
          (a, b) => findings.code[b].count - findings.code[a].count
        ),
      };
      let data = JSON.stringify(findings, null, 2);
      fs.writeFileSync("output.json", data);
      console.log("Search Complete!");
    }

    main().catch((e) => console.error(e));
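The flattening step is easy to try out on its own. A minimal sketch, with flattenSearches as a hypothetical name for what main does inline: every code string is paired with every organization, and the pairs come out in the order the API calls will be made.

```typescript
// Hypothetical standalone version of the flattening done inside main().
function flattenSearches(
  codeStrings: string[],
  organizations: string[]
): [string, string][] {
  const pairs: [string, string][] = [];
  for (const codeStr of codeStrings) {
    for (const org of organizations) {
      pairs.push([codeStr, org]);
    }
  }
  return pairs;
}

// Placeholder table and organization names.
const examplePairs = flattenSearches(
  ["users", "orders"],
  ["example-org-one", "example-org-two"]
);
// 2 code strings x 2 organizations = 4 searches
```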


The Output

The output breaks things down into the repositories that were found, the code and a prioritization of what to look at (based on occurrence).

  • repos... the repos found
    • repos.<repo>.code... what code was in them
    • repos.<repo>.codeCount and repos.<repo>.pathCount some basic counts for readability
  • code... the original search terms
    • code.<code>.repos... the repos found for that code
    • code.<code>.count... a count of "mentions"
    • code.<code>.repoCount... the number of repos that code was found in
  • priority... prioritized lists of what to look at by count (start at the top of the list)
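The structure above can also be written down as types. This is a sketch inferred from the fields the script writes: the interface names and the sample data are made up for illustration, while the sort order matches the one used to build priority.

```typescript
// Types inferred from the fields the script writes; the names are hypothetical.
interface PathHit {
  path: string;
  score: number;
  url: string;
}
interface RepoFinding {
  paths: PathHit[];
  code: Record<string, number>; // search term -> number of matching paths
  codeCount: number;
  pathCount: number;
}
interface CodeFinding {
  count: number;
  repos: string[];
  repoCount: number;
}
interface Findings {
  repos: Record<string, RepoFinding>;
  code: Record<string, CodeFinding>;
}

// Same ordering the script uses: most paths first, most mentions first.
function prioritize(findings: Findings): { repos: string[]; code: string[] } {
  return {
    repos: Object.keys(findings.repos).sort(
      (a, b) => findings.repos[b].pathCount - findings.repos[a].pathCount
    ),
    code: Object.keys(findings.code).sort(
      (a, b) => findings.code[b].count - findings.code[a].count
    ),
  };
}

// Made-up sample data, only to show the shape.
const hit = (path: string): PathHit => ({
  path,
  score: 1,
  url: `https://example.invalid/${path}`,
});
const sampleFindings: Findings = {
  repos: {
    "example-org/api": {
      paths: [hit("db/users.sql")],
      code: { users: 1 },
      codeCount: 1,
      pathCount: 1,
    },
    "example-org/web": {
      paths: [hit("src/users.ts"), hit("src/orders.ts")],
      code: { users: 1, orders: 1 },
      codeCount: 2,
      pathCount: 2,
    },
  },
  code: {
    orders: { count: 1, repos: ["example-org/web"], repoCount: 1 },
    users: {
      count: 2,
      repos: ["example-org/api", "example-org/web"],
      repoCount: 2,
    },
  },
};
const samplePriority = prioritize(sampleFindings);
```

With this sample, example-org/web comes first in priority.repos because it has the most matching paths, and users comes first in priority.code because it has the most mentions.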

I generally find this to be enough data to do some further post-processing on and generate graphs (the original post showed one built from my actual data).

Well... two tables spread across 24/800 repos isn't *that bad* I guess...

Unfortunately, this only searches for table names... not actual USAGE of them... so there's still a lot of data to go through.


If you have any useful tools for doing this type of refactoring analysis, let me know in the comments.


Source: https://dev.to/martzcodes/using-github-s-api-to-search-for-code-references-across-multiple-organizations-337l
