yayi C++, python, image processing, hacking, etc

Managing several SSH identities explained

Foreword

Juggling several identities with ssh can be tedious. This short article explains how ssh identities work and shows several ssh configurations you can use on both Linux and macOS. We'll start by introducing the problems, and then explore different configurations to overcome them.

Requirement: being familiar with ssh, ssh-agent, private/public keys.

Problem statement

Like many developers, I develop and push code both for my work and for my private projects, and I never mix the two worlds. There is a high chance that the services I am using (such as Bitbucket or GitHub) will be used, at some point, for both personal and professional projects. To avoid mixing things, the first thing to do is to have a dedicated account for your work projects, distinct from your personal account. You then have two identities for the same remote service.

How to configure the accesses? We want for instance to use the same computer and same tools (git/ssh), without having to run too many commands.

My typical macOS session starts like this:

ssh-add ~/.ssh/my_id_rsa_private_key
cd /Users/raffi/my-code/yayi
git fetch --all --prune

Then I code in happiness. The local copy of the repository is configured with ssh and the remote address looks like git@bitbucket.org:raffi/some-repository. In the above commands, git uses ssh under the hood, and this is well understood.

When an SSH connection is established, a sequence of authentication attempts (roughly speaking) is made to authorize the connection. First, the connection is opened as a user that must exist on the remote: in the previous example, this is the part before the @, which is git. For many hosted services this user is the same for everybody: your name is not git. The user, in the ssh sense, therefore carries nothing of your identity and is not relevant here. It only serves as a hook for the connection, on top of which the actual authentication is performed.

The ssh client then tries various authentication methods in some order: it can for instance first offer a single key to the remote server (the private key itself never leaves your machine). If that fails, meaning the remote rejects that key, the client may then offer the keys stored in the ssh-agent. If that fails again, it can finally fall back to a password prompt. You can see all this by running the ssh client in verbose mode with ssh -v.
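To see the order in which keys are tried, it helps to cut the verbose output down to the authentication lines. The helper below is a minimal sketch: filter_auth_lines is a hypothetical name, and the patterns are the debug messages of the OpenSSH client that appear in the traces later in this article; other versions may word them differently.

```shell
# Keep only the lines of `ssh -v` output that describe key authentication.
# filter_auth_lines is a hypothetical helper name, not part of OpenSSH.
filter_auth_lines() {
    grep -E 'Will attempt key|Offering public key|Server accepts key|Authentication succeeded'
}

# Usage (requires network access; ssh -v writes its debug output to stderr):
#   ssh -v -T git@bitbucket.org 2>&1 | filter_auth_lines
```

The keys are listed in the order in which the client offers them, which is exactly the information that matters in the rest of this article.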

All of this is very basic and again well understood. There are three important bits here though:

  1. the order of the authentication methods,
  2. the configuration of the remote ssh server,
  3. the authentication vs. authorization.

It seems easy to understand but the devil is in the details, so let's take the time to explain those notions with concrete examples.

Authentication vs. authorization vs. order

A common scenario is this:

  • you have a personal account personal set up with your personal public key,
  • you have another account work set up with your work public key. On this account, you generally do not use your personal key, for several good reasons (see the addendum).

It seems easy to solve: just add all of your private keys to your ssh-agent and let the ssh client do the work. Where is the catch?

The problem is that you do not know in advance the order of the keys that are presented to the ssh server.

  1. you want to connect to a work project and your ssh-agent presents your personal key first,
  2. the remote service identifies you as the user you use for personal projects: you are now authenticated on the remote service,
  3. once your identity has been established, you want to access work projects on which your personal user has no rights,
  4. the git server rejects your request: the repository is not configured to accept your personal account (which is a good thing, see the addendum again).

The connection is terminated, game over, and you have no chance to redo this loop with the next key in your ssh-agent, because ssh did its job right: you successfully authenticated to the remote service.

To summarize:

  • this problem exists mainly because you have various accounts on a remote service,
  • you cannot really decouple authentication from authorization: once you are authenticated with one identity, this identity is used for all subsequent operations,
  • you can always try hard to convince your ssh-agent to present keys in a certain order, but you will have the same issue for the other account,
  • you cannot possibly configure your work project to use your personal account: that is the opposite of what we want to achieve.

Before digging into the solutions, I would like to introduce another issue I encountered.

Remote configuration

Another issue related to the order of the authentication attempts is the configuration of the remote server. Indeed, some servers are configured to terminate the connection immediately if any authentication attempt fails. This means that, even if the remote service knows only one of your many identities, it may not wait until your ssh-agent finally presents the right key for that service.

To summarize, how the ssh server handles multiple attempts is a server-side configuration that you cannot control, so you had better be precise about how you access each remote service.

Managing several identities with SSH: the easy way

If you are comfortable juggling several terminal windows, or the excellent tmux, this section is for you:

  1. start a new terminal
  2. type eval $(ssh-agent -s)
  3. add a single private key to the newly created agent, and use this agent to manipulate a single identity and nothing else.

The command eval $(ssh-agent -s) creates a new ssh-agent and sets the right environment variables so that all ssh commands (including ssh-add) use the newly created agent.

You can see the environment variables you need by just typing

env | grep SSH

The advantage of this approach is that it is dead simple to execute and to remember, and you do not have to Stackoverflow it every time you start a new shell.

The main drawbacks are:

  • you lose one big advantage of the ssh-agent: when you have several tabs/shells and want to share the same ssh identity between them, there is some manual work to do. The trick is to copy the environment variable SSH_AUTH_SOCK from one tab (the one in which the ssh-agent was started) to another,
  • your ssh-agent is mostly bound to a shell: if you close the shell, you lose the possibility to reuse the agent (unless you copied the environment variable mentioned above). That is not really an issue IMO with tools such as tmux, as you can organize your shells in an easy way,
  • you end up with millions of ssh-agent processes on your computer (just use killall in that case).
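Both the copy between tabs and the cleanup can be scripted. The following is a minimal sketch for a POSIX shell; print_agent_export is a hypothetical helper name. The commented trap relies on ssh-agent -k, which kills the agent whose PID is stored in SSH_AGENT_PID.

```shell
# Print a line that can be pasted into another tab to reuse this tab's agent.
# print_agent_export is a hypothetical helper name.
print_agent_export() {
    printf "export SSH_AUTH_SOCK='%s'\n" "$SSH_AUTH_SOCK"
}

# Optional: kill the agent started in this shell when the shell exits,
# so that agents do not pile up. `ssh-agent -k` reads SSH_AGENT_PID.
#   trap 'ssh-agent -k >/dev/null' EXIT
```

Run print_agent_export in the tab that owns the agent and paste the printed line in the other tab.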

All in all, it seems that the advantages outweigh the drawbacks, and this can just be the method of choice.

Some explanations

The easy way, as we saw, is to start a brand new ssh-agent, add the necessary keys, stay in the same terminal, and perform various ssh operations in that context. But how does it work?

There is no magic here: eval $(ssh-agent -s) starts the agent and injects various environment variables into the current shell. One of those is SSH_AUTH_SOCK: it points to a newly created unix socket file which, by default, is the one the ssh and ssh-add programs use to reach the agent. If you remove that variable from your shell, those programs cannot guess which agent they should be talking to.
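What eval actually receives is a few lines of shell code printed by the agent. The sketch below replays a sample of that output without starting any agent; the socket path and the PID are made-up values, only the shape of the text matches what ssh-agent -s prints.

```shell
# Sample of what `ssh-agent -s` prints (path and PID are invented for the demo).
agent_output='SSH_AUTH_SOCK=/tmp/ssh-XXXXXXdemo/agent.1234; export SSH_AUTH_SOCK;
SSH_AGENT_PID=1235; export SSH_AGENT_PID;
echo Agent pid 1235;'

# eval runs those lines in the current shell: both variables are now set
# and exported, which is all that ssh and ssh-add need to find the agent.
eval "$agent_output"
echo "socket: $SSH_AUTH_SOCK"
```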

All of a sudden, we have to deal with new notions:

  • unix socket file
  • environment variable
  • default behavior, whatever it means

A unix socket file is just a file which, roughly speaking, acts as a communication pipe between processes. In our case, the socket file is created by the ssh-agent, and used by the ssh client to query the agent about the ssh keys. I do not know enough about the internals of ssh, but my guess is that any process on the current machine (provided that it has access rights to that socket file) can talk to the ssh-agent through this socket file: this exposes your secrets and renders the ssh-agent rather weak (we will talk more about this in the section on security concerns). This weakness can be mitigated, for instance, by hiding the name of that socket file as much as we can.

How can the ssh client possibly know which socket file to use, as there are so many files on your computer and the name of the socket file cannot be guessed? The answer is environment variables: when the agent starts with the -s option, it prints shell commands that set the environment variable SSH_AUTH_SOCK to the name of that socket file, and eval runs them in the current shell. There are some nice and nasty properties with environment variables that we will address later.

Just remember this: knowing the SSH_AUTH_SOCK (or the socket file) is equivalent to accessing the ssh-agent and the identities stored in it. In French: ça sent mauvais.

You can always copy the content of this environment variable to another shell, and by doing so the programs started in this other shell will use the ssh-agent started on that socket file.

To summarize: the easy method relies on environment variables together with a socket file that has an unpredictable name. This is the default behavior because it works well in most cases and offers a good security/privacy trade-off.

Tweaking ssh and its companions

As we have a method that works for a large number of scenarios, why bother tweaking ssh and the associated tools any further?

Well... no reason. Or maybe... to get a deeper understanding of how ssh works together with the ssh-agent, what the alternative solutions could be, what their respective drawbacks are, etc. Expanding our knowledge of ssh seems fully justified given how widely it is used in various contexts.

What could possibly be a non-default behavior, and how to change it? As an answer to the first question, we just saw environment variables and unix socket files. There are basically two ways of changing the default behaviour:

  • crafting the ~/.ssh/config file: this is a globally visible configuration and is independent from the shell that you are in,
  • passing environment variables and command line switches

What is ~/.ssh/config?

The ~/.ssh/config file is a configuration file read by the ssh set of tools every time they need to access a remote. It is organized in sections, one per remote host: each section sets parameters for connecting to that host, such as the real hostname/DNS entry or IP address to hit, how the authentication should be performed, etc. The various sections of ~/.ssh/config are well described on the web (usually under ssh-config).

Note: ~/.ssh/config can be a weakness in your ssh configuration as it drives the whole authentication process.

Managing several identities in ~/.ssh/config

Coming back to what we want to achieve: how do we set up the ssh-config such that we can manage various identities?

Let's start with an example of ~/.ssh/config:

host my.server.com
    <configkey> <configvalue>

Now when you type ssh someuser@my.server.com, the ssh client reads the ~/.ssh/config file and looks for an entry matching my.server.com. If no entry is found, the default behavior applies. Otherwise, each configkey in the matching section overrides the default behavior.

One of the possible configkey is HostName, which indicates the real name of the machine that ssh should connect to. This overrides the default behavior of using the remote machine as being the one that comes after the @ in the command ssh someuser@my.server.com. This means that in our previous example my.server.com can be the real name of the remote or a logical name.

A possible way of accessing the same remote with different configurations is then to set up the various configurations as follows:

host my.server-config1
    HostName my.server.com
    # additional configs

host my.server-config2
    HostName my.server.com
    # additional configs

Both configurations hit the same real remote machine my.server.com, but when you type ssh someuser@my.server-config1 this will use its special configuration from ssh-config. Of course the additional configuration part can be exactly the same, but what is important here is that we have created two logical and decoupled configurations for accessing the same remote my.server.com.
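One way to check which section applies to a logical name, without connecting, is ssh -G, which prints the configuration the client would use (OpenSSH 6.8 or later). The following is a minimal sketch that uses a throwaway configuration file, passed with -F, so that the real ~/.ssh/config stays untouched:

```shell
# Write the two logical hosts to a temporary config file.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
Host my.server-config1
    HostName my.server.com
Host my.server-config2
    HostName my.server.com
EOF

# -G prints the resolved options and exits; no connection is attempted.
resolved=$(ssh -G -F "$cfg" my.server-config1 | awk '$1 == "hostname" { print $2 }')
echo "my.server-config1 resolves to: $resolved"
rm -f "$cfg"
```

Dropping the -F option runs the same check against your real configuration.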

This propagates nicely to other tools that are using ssh, such as git, without any extra step:

git clone git@my.server-config1:personal/repo-personal
git clone git@my.server-config2:professional/repo-pro

makes git call ssh with the access we want from the ssh-config. The first line uses the section from ssh-config that is under my.server-config1 and that hits the server my.server.com, and this just works even if my.server-config1 is not a real/existing remote machine. Pretty neat!
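An existing clone does not need to be cloned again: it is enough to point its remote at the logical name. The following is a minimal sketch on a scratch repository (the repository is temporary, the remote names are the ones from the example above, and git remote get-url needs git 2.7 or later):

```shell
# Scratch repository standing in for an existing clone.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" remote add origin git@my.server.com:personal/repo-personal

# Switch the remote to the logical host defined in ~/.ssh/config.
git -C "$repo" remote set-url origin git@my.server-config1:personal/repo-personal
new_url=$(git -C "$repo" remote get-url origin)
echo "$new_url"
rm -rf "$repo"
```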

We see that our initial problem can be easily solved if we are able to specify which identity to use in each of those logical sections. And this is what we will tackle now.

You can spend quite some time on the various options of the ssh-config file. There are however two keys that are relevant to us: IdentityFile and IdentityAgent.

Specifying the identity file

The entry IdentityFile can be used to specify the ssh key file to use for a dedicated access. The idea here is to tell the ssh client which identity to use by pointing to a specific key file.

host my.server-config1
    HostName my.server.com
    IdentityFile /path/to/private-key-identity1

host my.server-config2
    HostName my.server.com
    IdentityFile /path/to/private-key-identity2

The IdentityFile entries above point to private keys. Unless those private keys have no passphrase (you should never do this), the only way to make this work without typing the passphrase at every connection is to add the corresponding identities to the ssh-agent the ssh client communicates with. However there is a catch: since ssh communicates with the ssh-agent, the identity used for establishing the connection can be any of the ones offered by the agent or by the ssh-config file:

  • the ssh client may look for a matching private key in the ssh-agent and present only this one,
  • the ssh client may first try all the keys stored in the ssh-agent regardless of the IdentityFile entry, and only if all of them fail present the file given by IdentityFile (which could never work unless the private key is unprotected; you should never do this, "bis repetita").

To limit the attempts to the keys that are mentioned in the IdentityFile section, an additional option IdentitiesOnly should be set to yes. Here is the trace of the logs with an agent containing two different passphrase protected keys (only relevant parts of the logs kept, so that you know where to focus):

> ssh-add -l
2048 SHA256:xyz1 /Users/raffi/.ssh/key-perso (RSA)
2048 SHA256:xyz2 /Users/raffi/.ssh/key-pro (RSA)

> more ~/.ssh/config
host bitbucket-perso
    hostname bitbucket.org
    IdentityFile ~/.ssh/key-perso
    IdentitiesOnly yes

> ssh -v git@bitbucket-perso
OpenSSH_7.9p1, LibreSSL 2.7.3
debug1: Reading configuration data ~/.ssh/config
[snip]
debug1: identity file ~/.ssh/key-perso type 0
[snip]
debug1: Will attempt key: ~/.ssh/key-perso RSA SHA256:xyz1 explicit agent
[snip]
debug1: Offering public key: ~/.ssh/key-perso RSA SHA256:xyz1 explicit agent
debug1: Server accepts key: ~/.ssh/key-perso RSA SHA256:xyz1
debug1: Authentication succeeded (publickey).
[snip]
logged in as super-user

Without IdentitiesOnly set to yes, we can see this:

> ssh -v git@bitbucket-perso
[snip]
debug1: Will attempt key: ~/.ssh/key-perso RSA SHA256:xyz1 explicit agent
debug1: Will attempt key: ~/.ssh/key-pro RSA SHA256:xyz2 explicit agent
[snip]
debug1: Offering public key: ~/.ssh/key-perso RSA SHA256:xyz1 explicit agent
debug1: Server accepts key: ~/.ssh/key-perso RSA SHA256:xyz1 explicit agent

The last 2 lines show that the authentication succeeded, but both keys of the agent are now candidates and we are not able to specify the order in which they are offered to the remote. This matters: we are trapped again in the same problem as mentioned earlier.
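With one section per identity, each restricted by IdentitiesOnly, the order of the keys in the agent should no longer matter. The sketch below prints such a section for each identity; identity_stanza is a hypothetical helper, and bitbucket-pro / key-pro simply follow the naming used in the traces above. Once reviewed, the output can be appended to ~/.ssh/config.

```shell
# Print one ssh-config section restricted to a single key.
# identity_stanza is a hypothetical helper name.
identity_stanza() {
    # $1: logical host name, $2: real host name, $3: private key file
    printf 'Host %s\n    HostName %s\n    IdentityFile %s\n    IdentitiesOnly yes\n\n' "$1" "$2" "$3"
}

# The quotes keep ~ literal: the ssh client expands it when reading the file.
identity_stanza bitbucket-perso bitbucket.org '~/.ssh/key-perso'
identity_stanza bitbucket-pro   bitbucket.org '~/.ssh/key-pro'
```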

To conclude this section, specifying the identity file and restricting the ssh client to the mentioned keys seems to be an easy and powerful way to manage various identities. We used the standard behavior of ssh together with very little configuration in the ssh-config, and a single instance of the default ssh-agent (often started with your login).

Most importantly, with this method we do not have to remember complicated commands: ssh-add as usual for adding protected keys to our agent, and ssh to connect. The only thing to remember is the name of the logical remotes that carry the various identities, but this is easily managed with a naming convention of your own (such as github-perso and github-pro, xxx-perso and xxx-pro). Another shortcoming is that the key to use is hard-coded inside the ssh-config, which is not desirable in some scenarios.

We will now dig into a more intricate way of managing identities, with several agents.

Specifying the agent

Another interesting entry of the ssh-config file is IdentityAgent, which indicates the ssh-agent socket to use for a particular access. We mentioned earlier an important fact that we repeat here: knowing the ssh-agent socket is equivalent to accessing the identities stored in that ssh-agent.

We now link each of our individual logical accesses to a particular instance of the ssh-agent, and our ssh-config may now look like this:

host bitbucket-id1
    HostName bitbucket.org
    IdentityAgent ~/agent-socket-id1

host bitbucket-id2
    HostName bitbucket.org
    IdentityAgent ~/agent-socket-id2

Note that the ssh-config file is something for the ssh client, we still need to be able to

  1. start various agents, with a specific socket each
  2. populate each agent with identities

Starting a new ssh-agent on a specific socket is easy:

ssh-agent -a ~/agent-socket-id1

In the above line, -a ~/agent-socket-id1 creates a unix socket file which is used by the ssh client for communicating the credentials. The ~ character is expanded to your home directory and should work most of the time (the same goes for the ssh client when it interprets the ssh-config file).

Communicating with this particular agent can then be done by specifying the authentication socket file through the environment variable (again) SSH_AUTH_SOCK:

SSH_AUTH_SOCK=~/agent-socket-id1 ssh-add ~/.ssh/my-identity-private-key
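Since the socket path is fixed, these two steps can be wrapped so that an agent is started only when none answers on that path. The following is a minimal sketch; ensure_agent is a hypothetical helper name. It relies on ssh-add -l exiting with status 2 when it cannot contact an agent, and it removes a stale socket file first because ssh-agent cannot bind to a path that already exists.

```shell
# Start an agent on the given socket unless one already answers there.
# ensure_agent is a hypothetical helper name, not an OpenSSH command.
ensure_agent() {
    sock=$1
    SSH_AUTH_SOCK=$sock ssh-add -l >/dev/null 2>&1
    if [ $? -eq 2 ]; then          # 2: no agent reachable on this socket
        rm -f "$sock"              # drop a stale socket left by a dead agent
        ssh-agent -a "$sock" >/dev/null
    fi
}

# Usage:
#   ensure_agent "$HOME/agent-socket-id1"
#   SSH_AUTH_SOCK="$HOME/agent-socket-id1" ssh-add ~/.ssh/my-identity-private-key
```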

Let's list now the ssh-agent processes on the machine:

> ps -ef | grep ssh-agent | grep -v grep
  502  1179     1   0 27Jun19 ??         0:00.19 /usr/bin/ssh-agent -l
  502 32576     1   0 11:26PM ??         0:00.00 ssh-agent -a ~/some-agent-socket

We have indeed two agents: the first one has been started by the OS automatically at login time, while the second we just started manually.

In the list of processes, we clearly see the unix socket file of the agent, and this list of processes is most of the time visible to all users currently logged in to the computer. Moreover, even if we manage to hide the name of the socket from the list of processes, the file name still appears in the ~/.ssh/config file.

This has serious security implications because, as we already mentioned, "knowing the unix socket <=> accessing your secrets". We will discuss mitigation later.

The ssh-agent started by the operating system, however, has no such visible unix socket file, as it uses the default behavior through the environment variable SSH_AUTH_SOCK.

Before discussing the pros and cons of this method, let's discuss the "why?" of the security concerns.

Security concerns

Why are we so concerned about security? Well, Secure Shell, or ssh, has security at the core of its design. Security is not only about the communication with a remote machine, it is also about your own setup, so that nobody steals your identity. We want to stay secure, and usually the problem is not the tool, it is the human.

Let's do a simple experiment:

# starting an agent on a specific socket
> ssh-agent -a /some/path/agent-id1
> SSH_AUTH_SOCK=/some/path/agent-id1 ssh-add ~/.ssh/key-perso
# switching to another user and listing the keys held by that agent
> sudo -u otheruser bash
> SSH_AUTH_SOCK=/some/path/agent-id1 ssh-add -l
2048 SHA256:xxxx+somethingsomethingsomething /Users/raffi/.ssh/key-perso (RSA)

What this experiment does is the following:

  1. as my user, I create a new agent with an authentication socket file located at /some/path/agent-id1, then I add a key to that agent,
  2. then I log in as another user otheruser and I list the keys available to the ssh client with the command ssh-add -l (fingerprints only).

What we learned from this experiment is the following: all the keys stored in my personal ssh-agent can be used by another user (including root, never trust your sysadmin), which is a serious security issue.

There are ways to mitigate this:

  • indicating the correct file permission on the socket file
  • using environment variables
  • hiding the name of socket file
  • a combination of the above

Socket file permissions

Context: we are digging into the scenario where the ssh-config file contains the entry IdentityAgent indicating the socket file to use, and we have to start an ssh-agent explicitly on that socket file. Hence the name of the socket is visible to all.

The first thing to do is to protect the unix socket file with proper and exclusive permissions. This is normally done by the ssh-agent command itself, but we can never be sure enough, for instance:

> ls -al $SSH_AUTH_SOCK
srw-rw-rw-  1 raffi  wheel  0 Aug 12 10:50 /private/tmp/com.apple.launchd.Z9R9g9AS0A/Listeners

is world readable/writable ...

In any case, the socket file is still reachable by a user with enough privileges on the machine. On a shared machine/workstation, this may be an issue if you put the socket file in a globally accessible folder, such as /tmp, or under a home folder that is not protected enough. On your own personal laptop, you may think that this is not such a problem, as you are not sharing the machine with any other user... except, from time to time, with people who hacked a process running as another user, or with services you installed because you trusted their providers (or because you thought it was a good idea to run wget -q -O - http://raffi.io/install | sudo bash to install python3).
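Whatever the mode of the socket itself, putting it in a directory that only you can enter adds one more barrier against other non-root users. The following is a minimal sketch; the directory ~/.ssh/agents is an assumption made for this example, not an OpenSSH convention.

```shell
# Directory reserved for agent sockets (hypothetical location).
agent_dir="$HOME/.ssh/agents"
mkdir -p "$agent_dir"
chmod 700 "$agent_dir"      # owner only: other users cannot even traverse it

# An agent would then be started with:
#   ssh-agent -a "$agent_dir/id1"
ls -ld "$agent_dir"
```

This does nothing against root, or against a process running under your own user.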

So maybe hard-coding the socket file is not a good idea? As mentioned earlier, it gives away too many hints and invites a security breach. The default behavior, on the other hand, uses environment variables. This implies that we do not control the name of the socket file, and that this file name only lives inside an environment variable.

Environment variables

A quick look at man ssh_config shows that the IdentityAgent entry interprets a value starting with a $ sign as the name of an environment variable to read from the user's environment. This means that

host bitbucket-id1
    HostName bitbucket.org
    IdentityAgent $SSH_AUTH_SOCK_ID1

should just use the variable SSH_AUTH_SOCK_ID1 taken from the environment and should work like a charm. Let's try: we now use the default behavior of ssh-agent -s (which should in theory already create the socket file in a secure way) and replace on the fly the default environment variable SSH_AUTH_SOCK with our new environment variable SSH_AUTH_SOCK_ID1:

eval $( ssh-agent -s | sed s/SSH_AUTH_SOCK/SSH_AUTH_SOCK_ID1/g )
chmod go-rw $SSH_AUTH_SOCK_ID1

After launching the ssh-agent, we can see those environment variables:

>  env | grep SSH
SSH_AGENT_PID=62091
SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.Z9R9g9AS0A/Listeners
SSH_AUTH_SOCK_ID1=/var/folders/jw/6tjl__hx7zn2f1n6z6rcr5_h0000gn/T//ssh-AzjElgidL6Ol/agent.62090

and for the processes:

>  ps  -ef | grep ssh | grep -v grep
  501  2795     1   0 12Aug19 ??         0:01.83 /usr/bin/ssh-agent -l
  501 62091     1   0  8:56AM ??         0:00.00 ssh-agent -s

So in terms of processes, nothing appears that can indicate where the socket file is. Then we use the ssh-add as before using the environment variables that indicate the identity to use. We do not really care where the socket file is located in fact, as it was the case with the default behavior.

SSH_AUTH_SOCK=$SSH_AUTH_SOCK_ID1 ssh-add ~/.ssh/my-identity-private-key

The environment variable SSH_AUTH_SOCK_ID1 is seen by the processes started from the same shell: if you use git, it calls ssh, which in turn reads ~/.ssh/config, finds $SSH_AUTH_SOCK_ID1 there and resolves it from its own environment (wow). The logic behind this is that child processes inherit a copy of the environment variables of their parent. git is a direct child process of your shell, and ssh is a child process of git: SSH_AUTH_SOCK_ID1 is passed along this chain.
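The inheritance rule is easy to check with a throwaway variable. In this sketch DEMO_SOCK is a made-up name and value; the point is that a child process only sees a variable once it has been exported, which the output of ssh-agent -s takes care of through its export statements.

```shell
unset DEMO_SOCK
DEMO_SOCK=/tmp/demo-agent.sock                       # plain shell variable
sh -c 'echo "child sees: ${DEMO_SOCK:-nothing}"'     # child sees: nothing

export DEMO_SOCK                                     # now part of the environment
sh -c 'echo "child sees: ${DEMO_SOCK:-nothing}"'     # child sees: /tmp/demo-agent.sock
```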

Interaction with other tools & graphical interfaces

All in all, this seems fine. But we just introduced one big usability issue: processes spawned by e.g. Finder or GNOME hold a copy of the initial session environment. Anything started from those processes (and they usually start quite early) will not see your newly created environment variables. Basically you lose the ability to use your ssh-config with environment variables from graphical interfaces like Xcode or Sourcetree: the variable SSH_AUTH_SOCK_ID1 simply does not exist in the program.

You may find resources here and there on the Internet on how to inject environment variables into a process, but overall this is complicated, if doable at all.

On some systems, it may be possible to start the graphical program from a shell such that it inherits its environment variables, but this is not always easy or handy. For instance open /Applications/Sourcetree.app/ in my macOS shell does not fix the issue, while starting Sourcetree with /Applications/Sourcetree.app/Contents/MacOS/Sourcetree

works (same with QtCreator from the original bin/ folder). Indeed open /Applications/Sourcetree.app/ seems to be doing some inter-process communication with Finder, which does not pass the environment variables.

Note that this problem does not exist if the ssh-config file points to known/hard-coded socket files: in that case the paths are constant and not indirectly interpreted from environment variable interpolation by ssh. However, as we said, hard-coding socket file path in ssh-config is not secure.

Summary on specifying the agent

Unfortunately this part looks quite complicated. Part of the complication is:

  • you have to remember either the socket location or the environment variable in use for each identity. If you did a non-secure job, you will ultimately be tempted to list/filter the processes to remember. If you did a more secure job, you will have to open your ~/.ssh/config quite often. You may use a naming convention for the environment variables that map to each of your identities,
  • the commands above are error prone. Example: you may add an identity to the wrong ssh-agent instance if you forget to prefix the command with the right environment variable,
  • your ~/.ssh/config becomes central and in some circumstances will require additional care (backup) and security (socket file name)
  • if you did a more secure job, this does not interact well with graphical interfaces

Conclusion

We have covered several ssh setups that make it possible to manipulate several identities, and we went deep into the configuration options and what they mean in terms of usage and security. As a recap:

the easy way

  • starts an ssh-agent inside a shell
  • is easy and does not require a lot of brain to start with
  • does not require additional configuration
  • works within a shell or several shells, but does not work with graphical interfaces (same problem as with custom environment variables, see the section on graphical interfaces)
  • has good security guarantees

the ssh-config way with several identity files

  • is not so difficult to set up
  • has little overhead of usage (logical names of the machines)
  • still has good security guarantees
  • works with graphical interfaces

the ssh-config way with several ssh-agents

  • is hard to set up
  • has enormous overhead of usage
  • has variable security guarantees
  • may work with graphical interfaces
  • may put sensitive information in the ~/.ssh/config file

Addendum

Why not reuse your personal keys for your work projects?

It was said earlier that you will most certainly use different sets of keys for the personal and the work accounts. There are several good reasons for doing so instead of reusing the same keys for the two accounts:

  1. It is sometimes just impossible: some service providers do not allow a public key to be reused, and you have to set up another key for the other account,
  2. Your work account might be managed by an administrator (who is not you): they can manipulate your work account, and this would involve manipulating entities (keys) that are also used for your personal projects. You can be on good terms with your administrator, but keeping a boundary with colleagues is also nice,
  3. and certainly many others (ping me if you have ideas...)

Some words about the ssh-config

One may argue that using a single ~/.ssh/config for managing several identities is mixing, in a single file, things that should not be mixed (the identities). Possible ways of dealing with this would be to

  1. develop specific ssh wrapper scripts that pass additional options to the default ssh command
  2. have several configuration files, and then pass the right one with the -F <configuration-file> option of the ssh client. This is more or less the same as the previous one, at the expense of an additional configuration file that one needs to remember. There is still an additional parameter (-F) to pass, which may lead to developing ssh wrappers as in the previous case.
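For git specifically, a wrapper script is not strictly needed to pass such options: git can be told per repository which ssh command to run through its core.sshCommand setting (git 2.10 or later), or per invocation through the GIT_SSH_COMMAND environment variable. The following is a minimal sketch on a scratch repository; the file name ~/.ssh/config-work is hypothetical.

```shell
# Scratch repository standing in for a work project.
repo=$(mktemp -d)
git init -q "$repo"

# Every ssh call made by git in this repository will use the alternate file.
git -C "$repo" config core.sshCommand 'ssh -F ~/.ssh/config-work'
ssh_cmd=$(git -C "$repo" config --get core.sshCommand)
echo "$ssh_cmd"
rm -rf "$repo"

# One-off equivalent, without touching the repository configuration:
#   GIT_SSH_COMMAND='ssh -F ~/.ssh/config-work' git fetch
```

This remains one more setting to remember per repository.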

Changing the global user configuration has several benefits over other types of solutions. This configuration file really acts as a small database centralizing the access policies for several remote servers. Having a single, default configuration file has, among many others, the following benefits:

  1. it centralizes to a unique single place all the logic needed for accessing remote servers. This eases the maintenance of those accesses a lot,
  2. it removes the need for developing specific ssh wrapper scripts that may pass additional parameters to the native ssh client.
  3. it works quite well with indirect ssh calls without additional configuration. One example is git: we do not need to manipulate extra variables such as GIT_SSH to make the ssh client behave as expected. Another nice property is that this still works with any GUI (such as the excellent Sourcetree for git).

The shortcoming is that, as the configuration file grows, it becomes central and critical, and needs proper care such as backups. One may also want to synchronize all this logic across several computers, which may run different operating systems. It may also become harder to maintain.

However, all those topics are orthogonal to the main benefits of easily configuring and manipulating several SSH identities.