Opened 4 years ago

Closed 4 years ago

#2951 closed defect (fixed)

abpcrawler output filenames all start with "None" instead of domain

Reported by: philll
Assignee: trev
Priority: P4
Milestone:
Module: Extensions-for-Adblock-Plus
Keywords:
Cc:
Blocked By:
Blocking:
Platform: Firefox
Ready: yes
Confidential: no
Tester: Unknown
Verified working: no
Review URL(s):

https://codereview.adblockplus.org/29324625/

Description

Environment

Debian Jessie
abpcrawler 387:0d9a4db7d073
Firefox 40

How to reproduce

  1. Execute the crawler with the attached urls.txt file

Observed behaviour

All output file names start with "None".

Expected behaviour

All output file names should start with the respective domain.

Attachments (1)

urls.txt (819 bytes) - added by philll 4 years ago.


Change History (6)

Changed 4 years ago by philll

comment:1 Changed 4 years ago by philll

  • Resolution set to invalid
  • Status changed from new to closed

The URL list in use didn't specify full URLs, only host names.

comment:2 Changed 4 years ago by trev

  • Resolution invalid deleted
  • Status changed from closed to reopened

I looked a bit more into this. The problem is that we open the URLs in Firefox, yet the host name is determined in Python. Firefox tries various things to get a "proper" URL; Python doesn't. Ideally we would use the same logic in both cases, but accessing the Firefox logic isn't really feasible. So I think adding the scheme if there is none should be good enough as a solution.
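A minimal sketch of the idea, not the actual patch under review: it assumes Python 3's `urllib.parse`, and the helper names `ensure_scheme` and `hostname_for` as well as the `http` default are made up for illustration. For a bare host name such as `example.com`, `urlparse` treats the whole string as a path and reports `hostname` as `None`, which would presumably end up as the literal "None" once formatted into a file name.

```python
import re
from urllib.parse import urlparse

# Matches a leading URL scheme such as "http://" or "ftp://".
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://")


def ensure_scheme(url, default_scheme="http"):
    """Return url unchanged if it starts with a scheme, else prefix one.

    The default scheme is an assumption for this sketch.
    """
    if _SCHEME_RE.match(url):
        return url
    return default_scheme + "://" + url


def hostname_for(url):
    """Extract the host name after making sure the URL has a scheme."""
    return urlparse(ensure_scheme(url)).hostname


# Without a scheme, the bare host name is parsed as a path.
print(urlparse("example.com").hostname)              # None
print(hostname_for("example.com"))                   # example.com
print(hostname_for("https://www.example.com/page"))  # www.example.com
```

This only covers the missing-scheme case and makes no attempt to replicate the other fix-ups Firefox applies to address input.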

comment:3 Changed 4 years ago by trev

  • Owner set to trev

comment:4 Changed 4 years ago by trev

  • Component changed from Unknown to Extensions-for-Adblock-Plus
  • Platform changed from Unknown / Cross platform to Firefox
  • Priority changed from Unknown to P4
  • Ready set
  • Review URL(s) modified (diff)
  • Status changed from reopened to reviewing

comment:5 Changed 4 years ago by trev

  • Resolution set to fixed
  • Status changed from reviewing to closed