Opened 2 years ago

Closed 13 months ago

#5352 closed defect (fixed)

generate_static_pages cannot deal with directories being turned into regular pages

Reported by: trev Assignee: atudor
Priority: P3 Milestone:
Module: Sitescripts Keywords:
Cc: kvas, sebastian Blocked By:
Blocking: Platform: Unknown / Cross platform
Ready: yes Confidential: no
Tester: Unknown Verified working: no
Review URL(s):

https://codereview.adblockplus.org/29887585/

Description (last modified by trev)

Background

This commit turned /committee/members/ directory into a /committee/members page. As a result, the page generation on the server was failing:

Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/opt/cms/cms/bin/generate_static_pages.py", line 155, in <module>
    generate_pages(args.source, args.output, args.rev)
  File "/opt/cms/cms/bin/generate_static_pages.py", line 123, in generate_pages
    write_file([locale] + page.split('/'), pagedata)
  File "/opt/cms/cms/bin/generate_static_pages.py", line 54, in write_file
    with codecs.open(outfile, 'rb', encoding=encoding) as handle:
  File "/usr/lib/python2.7/codecs.py", line 878, in open
    file = __builtin__.open(filename, mode, buffering)
IOError: [Errno 21] Is a directory: u'/var/www/acceptableads.com/en/committee/members'

For reference: the file is read before writing to avoid changing modification time unnecessarily (caching it tied to the modification time).

What to change

The CMS should be able to replace directories by regular files and vice versa.

Change History (9)

comment:1 Changed 2 years ago by trev

  • Description modified (diff)

comment:2 follow-up: Changed 2 years ago by matze

Since the invocation of cms.bin.generate_static_pages requires the destination as a parameter, one *could* declare this a limitation (or just intentional) and recommend (read: document) the user (or script) to always generate a clean build.

In that case the infrastructure code should start generating the content in /some/place/by/$timestamp and (re-) link /var/www/$generated_pages accordingly on success. This would also allow for an easy rollback when necessary.

Feel free to re-articulate this ticket if you think this approach is sufficient (so that it is about documenting, not coding). I'll create a new infrastructure ticket either way.

comment:3 in reply to: ↑ 2 Changed 2 years ago by trev

Replying to matze:

Since the invocation of cms.bin.generate_static_pages requires the destination as a parameter, one *could* declare this a limitation (or just intentional) and recommend (read: document) the user (or script) to always generate a clean build.

It's not intended to have this limitation. In fact, it specifically has code to remove outdated files which no longer exist in the current revision.

In that case the infrastructure code should start generating the content in /some/place/by/$timestamp and (re-) link /var/www/$generated_pages accordingly on success. This would also allow for an easy rollback when necessary.

It will also change the modification time for each file every 10 minutes. With caching usually being tied to the modification time one way or another, this is rather suboptimal. I already spelled out that constraint in the issue description explicitly.

comment:4 Changed 2 years ago by kvas

If the intended behavior of generate_static_pages script is to update target directory to have the same content as the website that would be produced from the source directory (or rather target revision in the repository, that lives in the source directory, grumble grumble), then I agree that inability to deal with directories is a bug.
I'm not a big fan of combining only loosely related functionality in this way, and my preference would be for something like clean generation by cms + re-linking (or perhaps clean generation + rsync, if we want to avoid touching the files that didn't change). In general, if you ask me, I'd move all unrelated functionality (like dealing with the version control system, cache control, etc.) out of cms. It increases complexity without a good reason and has already caused several subtle bugs/unexpected behaviors like this one.
Having said that, I don't feel very strong about doing it right now. So if we don't want to go refactor the whole approach to website generation just yet, I'd be happy to just document currently intended behavior and fix the directory->file failing case... and probably add a couple of tests for this while we're at it.

comment:5 Changed 2 years ago by kvas

  • Priority changed from Unknown to P3
  • Ready set

comment:6 Changed 14 months ago by atudor

  • Owner set to atudor

comment:7 Changed 14 months ago by atudor

  • Review URL(s) modified (diff)
  • Status changed from new to reviewing

comment:9 Changed 13 months ago by atudor

  • Resolution set to fixed
  • Status changed from reviewing to closed
Note: See TracTickets for help on using tickets.