Opened 4 years ago

Closed 4 years ago

#3139 closed defect (fixed)

Provisioning via Puppet fails during HG pull

Reported by: matze Assignee: matze
Priority: P1 Milestone:
Module: Infrastructure Keywords:
Cc: fhd, fred Blocked By: #2588
Blocking: Platform: Unknown / Cross platform
Ready: yes Confidential: no
Tester: Unknown Verified working: no
Review URL(s):

Description

When attempting to provision any box in production, the following error occurs:

$ ./kick.py -r puppetmaster.adblockplus.org -t web-abb-org-1
Updating data on the puppet master...
abort: no suitable response from remote hg!
Traceback (most recent call last):
  File "./kick.py", line 58, in <module>
    updateMaster(options)
  File "./kick.py", line 47, in updateMaster
    runCommand(options.user, options.remote, remoteCommand)
  File "/home/mhennig/AdblockPlus/infrastructure/run.py", line 113, in runCommand
    subprocess.check_call(command)
  File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ssh', 'puppetmaster.adblockplus.org', 'sudo hg pull -qu -R /etc/puppet/infrastructure && sudo hg pull -qu -R /etc/puppet/infrastructure/modules/private && sudo /etc/puppet/infrastructure/ensure_dependencies.py /etc/puppet/infrastructure']' returned non-zero exit status 255

A manual invocation of the hg pull commands on the puppetmaster revealed the following issue, which requires user interaction by default:

Warning: the ECDSA host key for 'hg1.adblockplus.org' differs from the key for the IP address '213.239.205.52'
Offending key for IP in /etc/ssh/ssh_known_hosts:72
Are you sure you want to continue connecting (yes/no)? n
Host key verification failed.

Although the ECDSA host key is correct, that key is not meant to be used: in #2588 we tried to ensure that the RSA keys, which Puppet integrates into the various /etc/ssh/ssh_known_hosts files, are used by default.

This is the first time that approach has failed, for reasons not yet known. The quick-fix confirms the diagnosis: after updating /etc/ssh/sshd_config on the Mercurial server to offer only the RSA host key, the issue is no longer reproducible.
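As an alternative to changing the server, the pull could pin the host key type on the client side. The following is only a sketch, not what kick.py or run.py actually do: the function name build_pull_command is hypothetical, and it assumes Python 3 and an OpenSSH client that still accepts the ssh-rsa algorithm. BatchMode=yes makes ssh fail instead of waiting for an answer to the host key prompt, and HostKeyAlgorithms restricts which host key types the client will negotiate.

```python
import shlex


def build_pull_command(repo_path, host_key_algorithms="ssh-rsa"):
    """Return an argv list for a non-interactive `hg pull -qu` on repo_path.

    Hypothetical helper: pins the SSH host key algorithm and disables
    interactive prompts, so a host key mismatch fails fast instead of
    blocking on a yes/no question.
    """
    ssh_command = "ssh -o BatchMode=yes -o HostKeyAlgorithms=%s" % host_key_algorithms
    # `--ssh` tells Mercurial which ssh command to use for this invocation.
    return ["hg", "pull", "-qu", "-R", repo_path, "--ssh", ssh_command]


if __name__ == "__main__":
    # Print the command in a form that can be pasted into a shell.
    argv = build_pull_command("/etc/puppet/infrastructure")
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The same options could instead be set once in the ui.ssh setting of an hgrc file, which avoids touching every call site.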

Change History (2)

comment:1 Changed 4 years ago by matze

The patch that has been applied as a quick-fix:

$ kick.py -t hg1
@@ -9,8 +9,8 @@
 Protocol 2
 # HostKeys for protocol version 2
 HostKey /etc/ssh/ssh_host_rsa_key
-#HostKey /etc/ssh/ssh_host_dsa_key
-#HostKey /etc/ssh/ssh_host_ecdsa_key
+HostKey /etc/ssh/ssh_host_dsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
 #Privilege Separation is turned on for security
 UsePrivilegeSeparation yes
 

Note that this requires an immediate fix (or at least a better quick-fix):
Every HG+SSH client whose own known_hosts file has not used the RSA key so far will now show a warning similar to the one in the issue description.

comment:2 Changed 4 years ago by matze

  • Resolution set to fixed
  • Sensitive unset
  • Status changed from new to closed

After deleting the ECDSA public key for hg1 from puppetmaster:/root/.ssh/known_hosts provisioning works as intended again.

I have no idea which previous command caused the ECDSA version to be included there in the first place (while the RSA version was available in /etc/ssh/ssh_known_hosts), but I believe we can safely assume that it was an accidental side effect of debugging one of the recent HG synchronization or performance issues.
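To spot a stray entry like this earlier, the known_hosts files can be scanned for the key types recorded per host. This is a minimal sketch under stated assumptions: the function name find_host_keys is hypothetical, it targets Python 3, and it only matches plain-text host names (hashed entries starting with "|1|" and wildcard patterns are not handled).

```python
import os


def find_host_keys(known_hosts_text, hostname):
    """Return (line_number, key_type) for each plain-text entry matching hostname."""
    matches = []
    for number, line in enumerate(known_hosts_text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # An optional marker (@cert-authority or @revoked) precedes the host list.
        if fields[0].startswith("@"):
            fields = fields[1:]
        if len(fields) < 3:
            continue  # not a complete "hosts keytype key" entry
        hosts, key_type = fields[0].split(","), fields[1]
        if hostname in hosts:
            matches.append((number, key_type))
    return matches


if __name__ == "__main__":
    # Paths taken from the ticket; files that do not exist are skipped.
    for path in ("/etc/ssh/ssh_known_hosts", "/root/.ssh/known_hosts"):
        if os.path.exists(path):
            with open(path) as handle:
                print(path, find_host_keys(handle.read(), "hg1.adblockplus.org"))
```

If the per-user file reports a key type that the system-wide file does not, that entry is a candidate for removal. Note that `ssh-keygen -R hostname -f file` removes all keys for that host from the given file, not just one key type.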
