Opened on 12/02/2016 at 05:44:03 PM
Closed on 07/07/2017 at 01:27:19 PM
#4687 closed change (fixed)
[cms] Context function for retrieving pages metadata
Reported by: | juliandoucette | Assignee: | jsonesen |
---|---|---|---|
Priority: | P2 | Milestone: | |
Module: | Sitescripts | Keywords: | |
Cc: | saroyanm, kvas, jsonesen, philll | Blocked By: | |
Blocking: | #4847, #5043 | Platform: | Unknown / Cross platform |
Ready: | yes | Confidential: | no |
Tester: | Unknown | Verified working: | no |
Review URL(s): |
https://codereview.adblockplus.org/29370597/ |
Description (last modified by jsonesen)
Background
- The new Help Center website categorizes pages by taxonomy like [product, platform, category, tag],
- Several of the page templates required to build this website list pages based on these taxonomies,
- Therefore we must provide a method for adblockplus/cms to implement these lists using Jinja2.
What to change
Note: from here on we'll use the terms:
- field name to refer to each type of taxonomy (e.g. product, platform, etc.),
- field value to refer to the value of a field for a particular page (e.g. product=adblock-plus), and
- page metadata to refer to the collection of all the fields and values of one page (including the page field which contains the id of the page).
Provide the following @contextfunction:
def get_pages_metadata(filters={}): """Return metadata dicts for all pages matching the filters Args: filters: Mapping of field names to desired field values or lists of them Example: {'product': 'adblock-plus', 'platform': ['firefox', 'chrome']} Returns: A list of metadata dicts for pages matching all the filters (if no filters are given, returns all pages) Example: [{page: 'acceptable-ads', product: 'adblock-plus', platform: 'firefox'}] Raises: TypeError: if filters is not a dict mapping strings to strings or lists of strings """
Since this method is being used as a context function in order to have access to the mercurial source object, unit tests should be added to this object's api in order to ensure that it remains stable.
Attachments (0)
Change History (30)
comment:1 Changed on 12/02/2016 at 05:45:44 PM by juliandoucette
comment:6 Changed on 12/05/2016 at 02:25:22 PM by kvas
It seems to me that the current API is a bit complex: the taxonomies parameter is trying to do too many things at the same time. May I suggest the following change, which I think makes it more clear:
def index(taxonomies, filters=None): """Index pages by taxonomy. Args: taxonomies: A list of taxonomies List of strings: ["product", "platform"] filters: Filters that will be used to select pages Dictionary: {'product': 'adblock-plus'} Returns: A list of pages matching any provided filters including page name and given fields (in taxonomies) List of dicts: [{page: "acceptable-ads", product: "adblock-plus", platform: "firefox"}] Raises: TypeError: if taxonomies is not a list TypeError: if filters is not a dict """
What do you think?
comment:7 Changed on 12/05/2016 at 02:43:14 PM by jsonesen
I implemented the tuple requirement and in fact I think you have a point here, as there ends up being additional complexity in order to determine if the filter exists or not in the list and this is done using isinstance() and in fact heavy type checking like this is normally not considered very pythonic. So I would agree that adding a filters parameter probably makes more sense here, although I think that the suggestion Julian made regarding a filtering type behavior is good it does make a bit more sense to do less 'inference' with the code and just make it an argument so that there is not additional implementation details to be documented regarding our fancy use of a filter tuple
comment:8 Changed on 12/05/2016 at 04:23:41 PM by jsonesen
Another detail that has been left out is the case where the metadata section contains a comma separated list, i think handling this requires splitting the metadata string and then checking the filter. For example is you use the filter to get pages tagged 'popular' i need to split the tags' section and then check over that to make sure that 'popular' tag exists, simply saying if tag == 'popular': is not sufficient because as is if 'tag' has 'popular, cool, awesome' it is represented as {'tags': 'popular, cool, awesome'}
comment:9 follow-up: ↓ 10 Changed on 12/05/2016 at 04:43:58 PM by juliandoucette
It seems to me that the current API is a bit complex
I think my API is simpler to use.
I'll leave it to you guys to determine what is/isn't "pythonic" and/or too complex.
Another detail that has been left out is the case where the metadata section contains a comma separated list
Acknowledged.
We could:
- Allow regular expressions
- Allow a separate format EG: {property: "tags", contains: "popular"} vs {property: "category", value: "mobile"}
*whatever works* ;)
comment:10 in reply to: ↑ 9 Changed on 12/05/2016 at 05:18:15 PM by kvas
Replying to juliandoucette:
I think my API is simpler to use.
I'll leave it to you guys to determine what is/isn't "pythonic" and/or too complex.
Speaking of simplicity, I just thought, maybe we just alway return all the "taxonomies" for each page. I don't see how they'd bother the user and for implementation purposes we'd need to load the page metadata anyway and after that it's actually easier to return all of them rather than just some. Does this make sense?
Another detail that has been left out is the case where the metadata section contains a comma separated list
We could:
- Allow regular expressions
- Allow a separate format EG: {property: "tags", contains: "popular"} vs {property: "category", value: "mobile"}
*whatever works* ;)
What if we treat all of those things as essentially tags (each taxonomy would be a separate kind of tags) and then if a page has all the tags contained in the filter it matches. Otherwise it doesn't. So if we have, for example:
page1 = {'keywords': ['foo', 'bar'], 'category': 'spam'} page2 = {'keywords': 'bar', 'category': 'ham'}
then
index({'keywords': 'bar'}) # returns page1 and page2 index({'keywords': ['foo', 'bar']}) # returns page1 index({'keywords': 'bar', 'category': 'ham'}) # returns page2
Would this work for you, Julian?
comment:11 follow-up: ↓ 12 Changed on 12/05/2016 at 05:46:26 PM by juliandoucette
Speaking of simplicity, I just thought, maybe we just alway return all the "taxonomies" for each page. I don't see how they'd bother the user and for implementation purposes we'd need to load the page metadata anyway and after that it's actually easier to return all of them rather than just some. Does this make sense?
IIRC we decided to use a whitelist for a reason. Perhaps @jsonesen can elaborate and you two can determine what is best?
Would this work for you, Julian?
Yes.
comment:12 in reply to: ↑ 11 Changed on 12/05/2016 at 07:15:25 PM by kvas
- Description modified (diff)
Replying to juliandoucette:
IIRC we decided to use a whitelist for a reason. Perhaps @jsonesen can elaborate and you two can determine what is best?
Cool, will do.
Would this work for you, Julian?
Yes.
Great!
One more thing. The name index is a bit non-descriptive. What if we call it something like get_pages_metadata? I took the liberty to already in the ticket description but let me know if you disagree.
comment:13 Changed on 12/05/2016 at 08:00:40 PM by juliandoucette
One more thing. The name index is a bit non-descriptive. What if we call it something like get_pages_metadata? I took the liberty to already in the ticket description but let me know if you disagree.
@jsonesen's prototype function was named "indexes". That's where I got "index" from.
I have no objections to changing the name. Choose whatever you think fits best.
comment:14 Changed on 12/06/2016 at 05:08:31 PM by kvas
- Priority changed from Unknown to P2
- Ready set
- Summary changed from Provide global index function in adblockplus/cms to [cms] Context function for retrieving pages metadata
comment:15 Changed on 01/09/2017 at 08:17:24 AM by jsonesen
- Review URL(s) modified (diff)
comment:16 Changed on 01/09/2017 at 08:17:48 AM by jsonesen
- Owner set to jsonesen
comment:17 Changed on 01/30/2017 at 10:14:20 AM by jsonesen
- Description modified (diff)
comment:18 Changed on 02/09/2017 at 12:35:54 PM by saroyanm
Any rough estimation when this will be ready, pushed ?
I think that this might be needed for acceptableads.com/committee launch that we have a deadline until March. We probably will be able to make a use of this functionality.
comment:19 Changed on 02/09/2017 at 12:37:55 PM by saroyanm
- Blocking 4847 added
comment:20 follow-up: ↓ 21 Changed on 02/09/2017 at 12:40:06 PM by jsonesen
As discussed this may later be added as a default global however as of now that is not the case. Therefore, since the codereview has been given the LGTM by vasily it is ready now. So you can either download the patch set from the review or I would be happy to send the files via email. Additionally if there is a repo for the help center I would be happy to provide a patch set for it.
comment:21 in reply to: ↑ 20 Changed on 02/09/2017 at 12:49:07 PM by saroyanm
Replying to jsonesen:
As discussed this may later be added as a default global however as of now that is not the case. Therefore, since the codereview has been given the LGTM by vasily it is ready now. So you can either download the patch set from the review or I would be happy to send the files via email. Additionally if there is a repo for the help center I would be happy to provide a patch set for it.
I see, didn't noticed that it can already be applied to website repository, we don't have a repository for Help Center AFAIK, but as I mentioned we might need this for acceptableads.com/committee launch, Stakeholders would like to have a Blog functionality there, which probably will not make until the deadline, but I think we can make a use of this functionality and create a static Blog using it, I'll add you to all related ticket and let you know in case of a question, I'm going to test implementation myself later today, downloading the patch should be enough for now I guess.
comment:22 follow-up: ↓ 23 Changed on 02/09/2017 at 12:57:44 PM by jsonesen
Sounds good to me, we can use these tickets as a case for adding this as a default global. Additionally, there is example pages on that patch set and some tests which should provide an example of how to utilize the global, if you have any questions or would like to do a call on this then feel free to reach out
comment:23 in reply to: ↑ 22 Changed on 02/09/2017 at 12:59:31 PM by saroyanm
Replying to jsonesen:
Sounds good to me, we can use these tickets as a case for adding this as a default global. Additionally, there is example pages on that patch set and some tests which should provide an example of how to utilize the global, if you have any questions or would like to do a call on this then feel free to reach out
Awesome!
comment:24 Changed on 03/10/2017 at 12:00:37 PM by abpbot
A commit referencing this issue has landed:
Issue 4687 - Add Context Function get_pages_metadata to Test Site
comment:25 Changed on 03/11/2017 at 12:04:17 PM by kvas
Just to clarify, this was landed by mistake, the idea for this change was that it will be part of Help Center initially and then if we find it generally useful, it could migrate to the CMS as a globally available global. But the time hasn't come yet so for now I rolled that commit back.
comment:26 Changed on 06/21/2017 at 10:54:05 AM by kvas
- Blocking 5043 added
comment:27 Changed on 06/23/2017 at 10:09:56 AM by jsonesen
- Review URL(s) modified (diff)
comment:28 Changed on 06/23/2017 at 10:09:59 AM by jsonesen
- Status changed from new to reviewing
comment:29 Changed on 07/07/2017 at 11:09:28 AM by abpbot
A commit referencing this issue has landed:
Issue 4687 - Add global get_pages_metadata to template converters
comment:30 Changed on 07/07/2017 at 01:27:19 PM by kvas
- Resolution set to fixed
- Status changed from reviewing to closed
@jsonesen I added an extra requirement to the input (List of tuples which provides a default value to filter by) to the requirements. Please let me know if this is OK with you. We can simplify these requirements if needed.