How to clone Github private repositories from Google Colab using SSH

Photo by Riku Lu on Unsplash

Motivation

Google Colaboratory (Google Colab or Colab in short) is popular among Deep Learning enthusiastic because it is a hosted Jupyter notebook service that requires no setup to use while providing free access to computing resources including GPUs. Basically, Colab provides a virtual machine with necessary Python packages for its users. Anyone can clone a public repository from Github and start training machine learning models by executing python scripts. However, this is not true for a private repository.

I am going to share a workaround to clone private repositories from Github using SSH protocol.

Go to Google Colab, create a new notebook, and click the button named **connect **in the top right corner of the page.

Step 1:

Now, generate SSH keys by running the following command from a code cell

!ssh-keygen -t rsa -b 4096

The public key algorithm is selected using the -t option and key size using the -b option.

After running the code cell by pressing Shift+Enter some instructions will appear.

> Enter a file in which to save the key (/home/*you*/.ssh/id_rsa): *[Press enter]*

> Enter passphrase (empty for no passphrase): *[Press enter]*

> Enter same passphrase again: *[Press enter]*

If any passphrase is provided by the user, SSH keys will be generated successfully. However, later it will causePermission deniederror while cloning the private repository from Github.

debug1: read_passphrase: can't open /dev/tty: No such device or address 
debug1: Trying private key: /root/.ssh/id_dsa 
debug1: Trying private key: /root/.ssh/id_ecdsa 
debug1: Trying private key: /root/.ssh/id_ed25519 
debug1: No more authentication methods to try. 
git@github.com: Permission denied (publickey).

So, no passphrase must be given while generating SSH keys. Just press Enter and leave the field empty.

Step 2:

Add SSH key fingerprints to the known_hosts file-

!ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts

This would only work properly if you have the SSH key authentication setup.

Step 3:

We need to add the newly created SSH key to the Github account. These steps are exactly the same as adding a new SSH key in Github for any local machine.

To achieve that, first get the SSH key by running this command and copy the output.

!cat /root/.ssh/id_rsa.pub

After that, follow these instructions from official Github documentation for adding a new SSH key to your Github account.

Test SSH key-

!ssh -T git@github.com

> Hi YourName! You've successfully authenticated, but GitHub does not provide shell access.

Step 4 (Optional):

By following the above four steps, one can successfully clone a private repository. If anyone wants to commit and push code changes from Google Colab to the private repo in Github, then must be specified Git configuration settings. It can be achieved by-

!git config --global user.email "your_email@example.com"

!git config --global user.name "your_name"

If everything goes well, you will able to clone private repositories from Github in Google Colab.

!git clone git@github.com:SadiaAfrinPurba/demo-private-repo.git

Do not forget to clone repositories via SSH URLs

Troubleshooting:

To debug what went wrong run this command- !ssh -vT git@github.com

The SSH key is valid for each Google Colab session which means for a new session you need to repeat the above steps. I have written a short and simple bash script to avoid doing repetitive tasks.

Clone this repository and run the following commands from Google Colab-

!cd google_colab_ssh
!sh private_repo_clone.sh "your_name@example.com" "your_name"

The script will only generate a new SSH key for Google Colab. You still need to add the SSH key to Github (Step 3) manually. I prefer to delete the SSH keys from Github after finishing the work in Google Colab.

Conclusion

I can achieve what I wanted to do by uploading the source codes into Google Drive. But I prefer to track any code versioning by using Git. I hope this tutorial will be helpful to anyone who is exploring machine learning and deep learning. Thank you!

Previous