Connector Improvement: Enable Filtering on GitHub COMMIT_FILE table
Planned
The COMMIT_FILE table in GitHub contains information per commit, per file, about all changes in the commit to the file in question. One problem with this is that many companies do not configure their GitHub repositories with .gitignore files that remove inessential files, such as files that are autogenerated by IDEs, various configuration files, and so on. These files are very chatty, changing all the time, and yet provide no analytical value and cause MAR costs to be very high.
We would ask that when the connector receives the JSON back from the COMMIT_FILE API endpoint, it first applies a filter, specified appropriately via configuration of the connector, and only returns records matching the filter. This could be a general capability for any connector that works with JSON data from API sources.
An example of a simple filter (conceptually) on the GitHub COMMIT_FILE data would be to only return records pertaining to files that have "src" in their file paths. Often there will be a "src" folder in a repository that contains the real source code, which is truly of interest analytically.
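To make the idea concrete, here is a rough sketch in Python of what such a filter might do to the parsed JSON records before they are loaded. It is only an illustration: the field name `filename`, the sample records, and the function names are assumptions on my part, not the connector's actual schema or configuration. It also reads "src in the file path" narrowly, as a directory named exactly `src`, which is one possible interpretation.

```python
from pathlib import PurePosixPath


def has_path_segment(filename, segment="src"):
    """Return True if `segment` is one of the directory names in `filename`."""
    # Only directory components are checked, not the file name itself.
    return segment in PurePosixPath(filename).parent.parts


def filter_commit_files(records, segment="src"):
    """Keep only records whose (assumed) `filename` field lies under `segment`."""
    return [
        record
        for record in records
        if has_path_segment(record.get("filename", ""), segment)
    ]


# Hypothetical sample records, shaped loosely like per-file commit data.
records = [
    {"filename": "src/app/main.py", "additions": 10},
    {"filename": ".idea/workspace.xml", "additions": 250},
    {"filename": "docs/src_notes.md", "additions": 3},
]

kept = filter_commit_files(records)
```

With the sample data above, only the record under the `src` folder is kept; the IDE file and the file that merely has "src" in its name are dropped. A real configuration option could be more general than this, for example a list of include and exclude patterns.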
Thank you for anything you can do on this issue.
Official comment
Hi William,
At this time we do not have the ability to filter data for the GitHub connector, but we are working on new filtering features for connectors, which are rolling out gradually. I will update this request when filtering is available for the GitHub connector.
Frank