GitHub

This page contains the setup guide and reference information for GitHub.

Prerequisites

  • GitHub.com repositories or organizations to sync
  • Either GitHub OAuth enabled by a Daspire administrator, or a GitHub personal access token

This connector currently targets GitHub.com. GitHub Enterprise Server API URLs are not supported by this connector version.

Features

Feature                            Supported?
Full Refresh Overwrite             Yes
Full Refresh Append                Yes
Incremental Sync Append            Yes
Incremental Sync Append + Deduped  Yes

Setup guide

Step 1: Choose an authentication method

You can authenticate with GitHub OAuth or a personal access token.

For OAuth, a Daspire Owner or Administrator must first configure the deployment-wide GitHub OAuth app credentials in the GitHub source form. Register the callback URL shown in Daspire in your GitHub OAuth app settings before users authenticate.

For a personal access token:

  1. Sign in to your GitHub account.

  2. Go to Settings -> Developer settings -> Personal access tokens.

  3. Click Generate new token, select the scopes that define the token's access, and click Generate token.

NOTE: Use the scopes repo, read:org, read:repo_hook, read:user, read:discussion, and workflow for full private repository, organization, hook, discussion, and workflow stream coverage. Depending on which streams you want to sync, the user generating the token needs the corresponding GitHub permissions:

  • To sync Collaborators, the user who generates the personal access token must be a collaborator. To become a collaborator, they must be invited by an owner. If there are no collaborators, no records will be synced. For details, see GitHub's documentation on access permissions.
  • Teams can only be synced by authenticated members of a team's organization. Personal user accounts and the repositories belonging to them do not have access to Teams features, so no records will be synced for them.
  • For syncing Projects, the repository must have the Projects feature enabled.
  4. Save your access token for later use.
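
As a quick sanity check before saving the token, the scopes recommended in the note above can be compared with the scopes GitHub reports for the token. The sketch below is an illustration and not part of Daspire. It assumes a classic personal access token, for which GitHub returns the granted scopes in the X-OAuth-Scopes response header of an authenticated API request (for example, a request to https://api.github.com/user). The helper name missing_scopes is hypothetical.

```python
# Scopes recommended in the note above for full stream coverage.
REQUIRED_SCOPES = {
    "repo",
    "read:org",
    "read:repo_hook",
    "read:user",
    "read:discussion",
    "workflow",
}


def missing_scopes(x_oauth_scopes_header: str) -> set[str]:
    """Return the recommended scopes that are absent from an X-OAuth-Scopes value.

    Limitation (assumption of this sketch): the comparison is a plain set
    difference, so a broader parent scope such as admin:org is not recognized
    as covering read:org.
    """
    granted = {s.strip() for s in x_oauth_scopes_header.split(",") if s.strip()}
    return REQUIRED_SCOPES - granted


# Example: a token created with only two of the recommended scopes.
print(sorted(missing_scopes("repo, workflow")))
```

An empty result means every recommended scope appears in the header. A non-empty result only indicates that some streams may return no records, depending on which streams you plan to sync.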

Step 2: Set up GitHub in Daspire

  1. Select GitHub from the Source list.

  2. Enter a Source Name.

  3. Authenticate with OAuth or a Personal Access Token. To balance your API quota consumption across multiple personal access tokens, enter the tokens separated by commas (,).

  4. GitHub Repositories - Enter a list of GitHub organizations/repositories separated by spaces, e.g. daspirehq/daspire for a single repository, or daspirehq/daspire daspirehq/daspire2 for multiple repositories. To sync all repositories in an organization, use a wildcard, e.g. daspirehq/*.

CAUTION: Save & Test reports inaccessible repositories, unknown repositories, unknown organizations, and organizations with no accessible repositories before the source is saved.

  5. Start date (Optional) - The date from which to replicate data. For streams that support this setting, only data generated on or after the start date is replicated.
  • These streams sync only records generated on or after the Start Date: comments, commit_comment_reactions, commit_comments, commits, deployments, events, issue_comment_reactions, issue_events, issue_milestones, issue_reactions, issues, project_cards, project_columns, projects, pull_request_comment_reactions, pull_requests, releases, review_comments, reviews, stargazers, workflow_runs, workflows.
  • The Start Date does not apply to the streams below, and all data is synced for them: assignees, branches, collaborators, issue_labels, organizations, pull_request_commits, pull_request_stats, repositories, tags, teams, users.
  6. Branch (Optional) - List of GitHub repository branches to pull commits from, e.g. daspirehq/daspire/main. If no branches are specified for a repository, the default branch is pulled.

  7. Page size for large streams (Optional) - Controls the GitHub API page size for streams that can return a large amount of data. Values between 10 and 30 are recommended for large repositories; the allowed range is 1 to 100.

  8. Click Save & Test.
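
To make the expected formats of the GitHub Repositories and Branch fields concrete, here is a minimal sketch of how such values could be checked before they are entered. It is an assumption-based illustration, not the connector's actual parser; the function names parse_repositories and parse_branch are hypothetical.

```python
def parse_repositories(value: str) -> dict[str, list[str]]:
    """Split a space-separated Repositories value into repos and org wildcards."""
    repositories: list[str] = []
    organizations: list[str] = []
    for entry in value.split():
        owner, sep, name = entry.partition("/")
        # Each entry must look like owner/repo or owner/* with exactly one slash.
        if not (owner and sep and name) or "/" in name:
            raise ValueError(f"expected owner/repo or owner/*, got {entry!r}")
        if name == "*":
            organizations.append(owner)  # owner/* means all repos of the org
        else:
            repositories.append(entry)
    return {"repositories": repositories, "organizations": organizations}


def parse_branch(value: str) -> tuple[str, str]:
    """Split an owner/repo/branch entry into (owner/repo, branch)."""
    # maxsplit=2 keeps slashes that belong to the branch name itself.
    parts = value.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected owner/repo/branch, got {value!r}")
    owner, repo, branch = parts
    return f"{owner}/{repo}", branch


print(parse_repositories("daspirehq/daspire daspirehq/daspire2 daspirehq/*"))
print(parse_branch("daspirehq/daspire/main"))
```

Splitting the branch entry at most twice is a deliberate choice in this sketch, so that a branch name containing a slash stays intact.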

Supported streams

This source outputs the following full refresh streams:

This source outputs the following incremental streams:

Notes

  1. Only 4 of the streams listed above (comments, commits, issues, and review_comments) are pure incremental, meaning that they:
  • read only new records;
  • output only new records.
  2. The workflow_runs and workflow_jobs streams are almost pure incremental:
  • they read new records and some portion of old records (from the past 30 days);
  • workflow_jobs depends on workflow_runs to read its data, so both follow the same logic;
  • they output only new records.
  3. The other 19 incremental streams are also incremental, with one difference. They:
  • read all records;
  • output only new records. Keep this behavior in mind when using these 19 streams, because it may affect your API call limits.
  4. For large streams, specifying a start_date far in the past may cause GitHub to keep returning errors instead of records. In this case, specifying a more recent start_date may help.

The "Start date" configuration option does not apply to the streams below, because the GitHub API does not include dates that can be used for filtering:
  • assignees
  • branches
  • collaborators
  • issue_labels
  • organizations
  • pull_request_commits
  • pull_request_stats
  • repositories
  • tags
  • teams
  • users
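
The "read all records, output only new records" behavior described in the notes above can be sketched as a cursor filter. This is a simplified illustration under stated assumptions, not the connector's implementation: the function name emit_new_records is hypothetical, the cursor field is assumed to be updated_at, and all timestamps are assumed to be ISO 8601 strings in the same UTC format, so that plain string comparison orders them correctly.

```python
from typing import Optional


def emit_new_records(
    records: list[dict],
    state: Optional[str],
    cursor_field: str = "updated_at",
) -> tuple[list[dict], Optional[str]]:
    """Scan every record, but return only those newer than the saved state.

    Every record is still read (and costs API quota); only the output is
    filtered. Returns the new records and the updated cursor state.
    """
    new_state = state
    output = []
    for record in records:
        value = record[cursor_field]
        if state is None or value > state:
            output.append(record)
            if new_state is None or value > new_state:
                new_state = value
    return output, new_state


first_page = [
    {"id": 1, "updated_at": "2024-01-01T00:00:00Z"},
    {"id": 2, "updated_at": "2024-02-01T00:00:00Z"},
]
emitted, state = emit_new_records(first_page, None)

# On the next sync all three records are read again, but only id 3 is emitted.
second_page = first_page + [{"id": 3, "updated_at": "2024-03-01T00:00:00Z"}]
emitted, state = emit_new_records(second_page, state)
print([r["id"] for r in emitted], state)
```

A pure incremental stream would differ only in the read step: it would ask the API for records newer than the saved state instead of scanning the full list.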

Performance consideration

The GitHub integration should not run into GitHub API rate limits under normal usage. For details on those limits, see the GitHub documentation article "Rate limits for the REST API".
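
When several personal access tokens are supplied as a comma-separated value, spreading requests across them keeps each token further from its own rate limit. The sketch below shows one simple way to do that with round-robin rotation. It illustrates the idea only and is not a description of how Daspire balances tokens internally; the function name token_rotation and the placeholder token values are hypothetical.

```python
from itertools import cycle
from typing import Iterator


def token_rotation(tokens_field: str) -> Iterator[str]:
    """Return an endless round-robin iterator over comma-separated tokens."""
    tokens = [t.strip() for t in tokens_field.split(",") if t.strip()]
    if not tokens:
        raise ValueError("at least one token is required")
    return cycle(tokens)


rotation = token_rotation("token_a, token_b")

# Each outgoing request takes the next token in turn.
for _ in range(3):
    headers = {"Authorization": f"Bearer {next(rotation)}"}
    print(headers["Authorization"])
```

Round-robin is the simplest policy; a more careful client could instead pick the token with the most remaining quota.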

Troubleshooting

The maximum number of tables that can be synced at a time is 6,000. If fetching the schema fails because this limit is reached, adjust your settings to reduce the number of tables.