Opened 13 months ago

Last modified 5 months ago

#5334 new change

Should CMS export HTML files with HTML extensions?

Reported by: juliandoucette
Assignee:
Priority: Unknown
Milestone: Websites continuous integration
Module: Sitescripts
Keywords:
Cc: wspee, kvas, jsonesen, ire, saroyanm
Blocked By:
Blocking:
Platform: Unknown / Cross platform
Ready: no
Confidential: no
Tester: Unknown
Verified working: no
Review URL(s):

Description

Environment

GitLab CI

How to reproduce

Branch

Observed behaviour

Pages without extensions work:

But index does not resolve properly (because there is no .html extension):

Expected behaviour

index pages should resolve properly without specifying "index".


As kvas has already stated, we can manually add .html to the end of a page name, but I don't think we should have to. Perhaps we could add .html to the end of all pages, or just to index pages?

Change History (12)

comment:1 follow-up: Changed 13 months ago by kvas

I don't think it's fair to consider this a "defect" of CMS. You're trying to use it in a way that it was not designed for and some things don't work, that's to be expected. It's not a defect of the car that it can't fly, as nice as it would be if it could.

Anyway, I do agree that it would be useful to be able to give the files that CMS outputs reasonable file extensions according to their mime-type. I would propose to make it an option to generate_static_pages.py that is off by default. Does this sound about right?

comment:2 in reply to: ↑ 1 Changed 13 months ago by juliandoucette

  • Type changed from defect to change

Replying to kvas:

I don't think it's fair to consider this a "defect" of CMS. You're trying to use it in a way that it was not designed for and some things don't work, that's to be expected. It's not a defect of the car that it can't fly, as nice as it would be if it could.

Ack. I changed it to a change.

-- I think the defect description format is still effective here. Feel free to change it if you disagree.

Anyway, I do agree that it would be useful to be able to give the files that CMS outputs reasonable file extensions according to their mime-type. I would propose to make it an option to generate_static_pages.py that is off by default. Does this sound about right?

Yes.

Last edited 13 months ago by juliandoucette

comment:3 follow-up: Changed 13 months ago by juliandoucette

Note - In production:

  'https://adblockplus.org/': 200,
  'https://adblockplus.org/index.html': 404,
  'https://adblockplus.org/index': 200,
  'https://adblockplus.org/index/': 404,
  'https://adblockplus.org/en/about': 200,
  'https://adblockplus.org/en/about/': 404

Therefore we cannot make most static web servers behave like production simply by adding or omitting .html. The closest we can get is to add .html to index files only.


PS: Perhaps we should add some server side logic to resolve page names followed by '/' correctly if there is no folder with the same name?

comment:4 in reply to: ↑ 3 Changed 13 months ago by kvas

Given that we need .html extension only on particular pages in some cases, I thought that we probably need to be able to specify extension on individual pages. The obvious way to do it is to have a metadata field for it. It would default to no extension, but it seems that it would also be useful to be able to set another default (e.g. .html). Thus we would end up with the following list:

  • defaultextension configuration option in settings.ini,
  • --default-extension option to generate_static_pages.py that overrides the defaultextension configuration option,
  • extension metadata field that overrides all of the above.

Seems that it covers all the use cases described above. If there's no objection, I can put this into the text of the ticket and make it ready.
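
To make the proposed precedence concrete, here is a minimal sketch, not actual CMS code. The function name and its parameters are hypothetical; only the three setting names (defaultextension, --default-extension, extension) come from the list above. It assumes an empty string means "explicitly no extension".

```python
from typing import Optional


def resolve_extension(
    page_metadata: dict,
    cli_default: Optional[str] = None,
    ini_default: Optional[str] = None,
) -> str:
    """Return the suffix for a page, most specific setting first.

    Hypothetical helper: page metadata beats --default-extension,
    which beats defaultextension from settings.ini.
    """
    for candidate in (page_metadata.get("extension"), cli_default, ini_default):
        if candidate is not None:
            # Accept both "html" and ".html"; "" means no extension at all.
            return "." + candidate.lstrip(".") if candidate else ""
    return ""  # Proposed overall default: no extension.
```

With this ordering, a page could opt out of a site-wide default by setting an empty extension in its metadata, e.g. resolve_extension({"extension": ""}, cli_default="html") gives no suffix.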

Last edited 13 months ago by kvas

comment:5 Changed 13 months ago by juliandoucette

Seems that it covers all the use cases described above.

I don't think so?

  • If I set --default-extension to html then non-index pages will require .html in their paths on GitLab
    • This may not be a problem if CMS also appends .html to links via converters.py?
      • But it will still create a difference in functionality between GitLab (staging) and production
  • If I set extension to html then /index pages (without .html) will not resolve properly in production? (not that anyone uses /index anyway)

I know it's less elegant, but I think an *append html extension to just index pages* feature fixes the problem exactly.


  • I haven't tested these assumptions
  • What applies to GitLab here generally applies to any static webserver

comment:6 Changed 12 months ago by kvas

If the problem is only about index[.html] pages and only on GitLab (that is, you don't want to change the way those pages are named in production), perhaps the easiest approach would be to just add a build step in GitLab CI that copies all index files to index.html in the same directory. However, I don't think it's a good idea to create these differences between staging and production.
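
For illustration, the build step mentioned above could look roughly like the following sketch. The function name and the idea of running it as a standalone Python step are assumptions; it only presumes that the static output directory contains extensionless files named index.

```python
import shutil
from pathlib import Path


def copy_index_files(output_dir: str) -> list:
    """Copy each extensionless `index` file to `index.html` beside it.

    Hypothetical post-build step; returns the files it created.
    """
    created = []
    for index_file in sorted(Path(output_dir).rglob("index")):
        if not index_file.is_file():
            continue  # Skip directories that happen to be named "index".
        target = index_file.with_name("index.html")
        if not target.exists():
            shutil.copyfile(index_file, target)
            created.append(target)
    return created
```

Because the original index files stay in place, both /index and the directory URL would keep resolving on a static server.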

If we change index to index.html in production as well, then extension metadata field should take care of this just fine. Perhaps we don't need all the other options, but it seemed kind of more feature complete and not much overhead compared to implementing just extension metadata.

In any case, if this feature is implemented in some way, the links that CMS generates should respect the extension configuration for the page. At least that's how I always thought about it.

comment:7 Changed 12 months ago by ire

  • Cc ire added; iaderinokun removed

comment:8 follow-up: Changed 12 months ago by juliandoucette

Afterthought:

Static websites usually support paths ending in .html and / via index.html.

I think that our paths ending in page names without .html is more of a mistake than a feature. And I would prefer to support these paths by redirect rules if possible. e.g. if the path does not end in / or .html then serve or redirect to ${PAGE_NAME}/.
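
As a rough sketch of that rule (not tied to any particular web server; both function names are hypothetical, and query strings are ignored for simplicity):

```python
from typing import Optional


def redirect_target(path: str) -> Optional[str]:
    """Return the path to redirect to, or None to serve the path as is."""
    if path.endswith("/") or path.endswith(".html"):
        return None
    return path + "/"


def file_for(path: str) -> str:
    """Map a request path to the file a typical static server would read."""
    # A directory-style path is served from the index.html inside it.
    return path + "index.html" if path.endswith("/") else path
```

Under these assumptions /en/about would redirect to /en/about/, which in turn would be served from /en/about/index.html.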

comment:9 in reply to: ↑ 8 Changed 12 months ago by kvas

Replying to juliandoucette:

I think that our paths ending in page names without .html is more of a mistake than a feature.

I tend to agree with you. I see an aesthetic argument for not having .html, but I think it works poorly in practice with a static site generator.

Anyway, given that we seem to be moving towards dockerized deployments of websites and also towards having build steps, would you like this implemented or perhaps it's not so needed? And if it is needed, perhaps we should have an IRC chat or a call with you to quickly hammer out the details. Seems like it would be faster than doing it in the ticket.

comment:10 Changed 12 months ago by juliandoucette

Anyway, given that we seem to be moving towards dockerized deployments of websites and also towards having build steps, would you like this implemented or perhaps it's not so needed? And if it is needed, perhaps we should have an IRC chat or a call with you to quickly hammer out the details. Seems like it would be faster than doing it in the ticket.

If we add a command line argument to generate_static_pages that automatically appends .html to all generated pages then that would make it significantly easier for me to manipulate them using gulp. e.g. gulp.src('dist/**/*.html'). I think that this would be enough for me (e.g. if I wanted to match production exactly then I could remove .html from all pages not named index).
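
The "match production" post-processing described above might be sketched like this, in Python rather than gulp. The function name is hypothetical, and it assumes every generated page already ends in .html and that no directory shares a name with a page in the same folder.

```python
from pathlib import Path


def strip_html_from_non_index(output_dir: str) -> list:
    """Rename `name.html` to `name` unless the file is `index.html`.

    Hypothetical post-build step; returns the new paths.
    """
    renamed = []
    # sorted() materialises the listing before any file is renamed.
    for html_file in sorted(Path(output_dir).rglob("*.html")):
        if html_file.is_file() and html_file.name != "index.html":
            target = html_file.with_suffix("")
            html_file.rename(target)
            renamed.append(target)
    return renamed
```

Calling strip_html_from_non_index("dist") after the build would leave index.html files untouched and drop the extension everywhere else.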

comment:11 Changed 10 months ago by juliandoucette

  • Milestone set to Websites continuous integration

comment:12 Changed 5 months ago by juliandoucette

It seems like this will be more relevant now that we are moving towards a gulp build process on GitLab.
