Opened on 06/04/2018 at 11:20:32 PM

Closed on 07/11/2018 at 05:17:45 PM

#6728 closed defect (fixed)

[CMS] Mercurial 4.6 release breaks static content generation

Reported by: jsonesen Assignee: jsonesen
Priority: P3 Milestone:
Module: Unknown Keywords:
Cc: kvas, snoack, rhowell, matze, juliandoucette Blocked By:
Blocking: Platform: Unknown / Cross platform
Ready: yes Confidential: no
Tester: Unknown Verified working: no
Review URL(s):

https://codereview.adblockplus.org/29805580/
https://codereview.adblockplus.org/29827639/

Description

Environment

MacOS / Debian Linux
Mercurial 4.6

How to reproduce

(if running older release of mercurial):

from root of the cms add mercurial to tox.ini dependencies
run tox

(if running 4.6 release):

run tox on the repo

Alternatively (from cms root):

hg archive -r default -t uzip - | less
...

Observed behaviour

All cms tests which rely on static content generation throw errors in tox.

Mercurial throws:
abort: Illegal seek Exception IOError: (29, 'Illegal seek') in <bound method ZipFile.__del__ of <zipfile.ZipFile object at 0x10a2d1ed0>> ignored

Expected behaviour

Tests should pass with no errors

Notes

After a bit of digging I found this commit in which the developers removed their reimplementation of os.popen and replaced it with a subprocess class handler which uses PIPE as stdout. Unfortunately the ZipFile library (which archives repos in the hg source) uses seek() quite a lot and pipes are not 'seekable' as does subprocess.check_output() and the commit message says explicitly they are dropping textIO support for this repo which we rely on here when getting the archive data. Also, archiving is not broken when actually archiving to a file rather than stdout, and even works with hg archive -r default -t uzip - > archive.zip

I do think this is an external dependency issue but I don't feel strongly that this is a regression in Mercurial, and I also think that our use of StringIO to directly populate the archive/source data is something that *could* be changed, the other option is to pin Mercurial at 4.5 which is a less ideal solution IMHO.

Attachments (0)

Change History (14)

comment:1 Changed on 06/05/2018 at 03:25:33 PM by kvas

  • Cc matze added
  • Priority changed from Unknown to P3
  • Ready set

Thanks for investigating this, Jon. I think pinning Mercurial is not a good option because it will be quite hard to control this in different environments. So our two options would be saving the archive to a temporary file or removing the HG integration and MercurialSource. My preference is the latter but we need to check what relies on Mercurial integration first (if anything). Most likely we will need input from Ops (adding Matze for this).

comment:2 follow-up: Changed on 06/06/2018 at 04:07:24 PM by kvas

  • Cc juliandoucette added

I have discussed this with Matze today. The position of Ops (as I understood it) is:

  • They would prefer CMS to not be dependent on Mercurial,
  • The feature where the CMS adds a version number to the URLs of static resources is important for caching and it should be kept (we have two options here: quick and dirty of just using a random number or a more comprehensive one of using some kind of a hash of website content; Matze suggested that the former is sufficient for all practical purposes),

I would therefore propose that we the approach of removing Mercurial-dependence (so both generation and test server will use FileSource) while simultaneously adding the version functionality to FileSource unless we see objections.

I've just realized that Julian is not CCd on this ticket, and he is one of the main stakeholders of this change. Julian, is it clear to you what we're planning to do and do you have any input?

comment:3 Changed on 06/06/2018 at 05:21:14 PM by juliandoucette

I've just realized that Julian is not CCd on this ticket, and he is one of the main stakeholders of this change. Julian, is it clear to you what we're planning to do and do you have any input?

  1. IIRC wspee has already started removing mercurial dependency from the CMS. I assume you're aware of his progress?
  2. I don't know what you mean by version functionality?

comment:4 Changed on 06/06/2018 at 05:28:18 PM by kvas

Thanks, Julian.

  1. Yeah, I've seen Winsley's patch and we've discussed this with him. This ticket is a bit different in that it removes Mercurial dependency completely, not as a command line option.
  2. The version functionality is adding ?<version> to static resource URLs in the generated pages to make sure that outdated cached static resources are not served with new versions of the pages. <version> here is coming from Mercurial, so we'll have to implement it differently.

comment:5 Changed on 06/06/2018 at 05:29:33 PM by juliandoucette

Thank you for clarifying. Your solutions SGTM.

comment:6 in reply to: ↑ 2 Changed on 06/06/2018 at 05:41:06 PM by jsonesen

The FileSource implementation of versioning for static URL's would be better if it wasn't random since that could make testing a bit tricky and also not very useful for human eyes. Perhaps leveraging difflib to track the changes between generations could be the key,? if there is a diff since the last generation increment the version.

comment:7 follow-up: Changed on 06/06/2018 at 05:48:56 PM by kvas

  • For testing we can mock the randomness, so that's ok.
  • If we want to avoid changing version parameter when the content didn't change, the easiest way would probably be to hash the website sources into a short string. I'm not really sure if it's worth it and we will probably still have to mock it in tests (otherwise small changes to test website will change all the URLs in all expected outputs).

comment:8 in reply to: ↑ 7 Changed on 06/06/2018 at 05:53:03 PM by jsonesen

Replying to kvas:

  • For testing we can mock the randomness, so that's ok.
  • If we want to avoid changing version parameter when the content didn't change, the easiest way would probably be to hash the website sources into a short string. I'm not really sure if it's worth it and we will probably still have to mock it in tests (otherwise small changes to test website will change all the URLs in all expected outputs).

Sounds good.

comment:9 Changed on 06/12/2018 at 10:36:23 PM by jsonesen

  • Review URL(s) modified (diff)
  • Status changed from new to reviewing

comment:10 Changed on 06/12/2018 at 10:37:16 PM by jsonesen

  • Owner set to jsonesen

comment:11 Changed on 07/10/2018 at 03:17:04 PM by abpbot

A commit referencing this issue has landed:
Issue 6728 - Remove Mercurial dependency

comment:12 Changed on 07/11/2018 at 05:14:51 PM by jsonesen

  • Review URL(s) modified (diff)

comment:13 Changed on 07/11/2018 at 05:17:24 PM by abpbot

A commit referencing this issue has landed:
Issue 6728 - Remove dead hg revision argument

comment:14 Changed on 07/11/2018 at 05:17:45 PM by jsonesen

  • Resolution set to fixed
  • Status changed from reviewing to closed

Add Comment

Modify Ticket

Change Properties
Action
as closed .
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from jsonesen.
 
Note: See TracTickets for help on using tickets.