parsing: option to [end] to terminate parsing even if there is further input #684

@wezm

Description

I use time in my rsspls project (thanks!). It's a tool that uses CSS selectors to extract parts of web pages and build an RSS feed from them. time is used for parsing dates that will become the published date of the RSS item. In wezm/rsspls#46 the element in the HTML that contains the date actually has two dates in it like this:

<td tabindex="0" role="cell" class="periodo-pubblicazione date">31/05/2024<br>  15/06/2024</td>

When extracted, this becomes "31/05/2024 15/06/2024". We'd like to be able to parse just the first date. This is similar in nature to #471, but my idea is to add a modifier to the end component that allows it to be used even when not all of the input has been consumed. This would allow using a format description like [day padding:zero]/[month padding:zero]/[year][end eof:false].
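For comparison, here is a rough sketch of the kind of pre-processing a caller has to do today without such a modifier: trim the extracted text down to its first token before handing it to the parser. It uses only the standard library, and both helper names (`leading_token`, `split_dmy`) are made up for illustration and are not part of time. `split_dmy` only stands in for the real parsing step and does no calendar validation.

```rust
/// Hypothetical helper (not part of `time`): returns the first
/// whitespace-delimited token of the extracted text, if there is one.
fn leading_token(input: &str) -> Option<&str> {
    input.split_whitespace().next()
}

/// Hypothetical stand-in for the real parsing step: splits a `DD/MM/YYYY`
/// token into numeric (day, month, year). It performs no calendar validation.
fn split_dmy(token: &str) -> Option<(u8, u8, i32)> {
    let mut parts = token.split('/');
    let day: u8 = parts.next()?.parse().ok()?;
    let month: u8 = parts.next()?.parse().ok()?;
    let year: i32 = parts.next()?.parse().ok()?;
    // Reject tokens with more than three `/`-separated fields.
    if parts.next().is_some() {
        return None;
    }
    Some((day, month, year))
}
```

In practice the token returned by `leading_token` would be passed to time's own parser with the format description [day padding:zero]/[month padding:zero]/[year] rather than to `split_dmy`. The proposed modifier would make this extra slicing step unnecessary.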

I'd be open to implementing this if it seems reasonable.

Metadata

    Labels

    A-parsing (Area: parsing)
    C-feature-request (Category: a new feature (not already implemented))
