+
Skip to content
This repository was archived by the owner on Sep 9, 2020. It is now read-only.

Conversation

MyklClason
Copy link
Contributor

Incomplete. I've done most of the work to setup complete filename matching for linting. Figured I'd put it up here for discussion before I finish it. Only a single line of the original code is changed. Though the final version should be more integrated. Code should be fully backwards compatible, just a little slower.

Filenames are now broken into a list of possible keys for which linters may be assigned rather than being limited to only extensions. A best match search is done from left to right to determine which linter to use.

Rules/Expectations:
    filename         => List of possible linter keys
    .gitlint         => ['.gitlint']
    travis.yml       => ['travis.yml', '.yml']
    phpunit.xml.dist => ['phpunit.xml.dist', '.xml.dist', '.dist', '.xml']
    .b.c.d           => ['.b.c.d', '.c.d', '.d', '.b.c', '.c', '.b']
    a.b.c.d          => ['a.b.c.d', '.b.c.d', '.c.d', '.d', '.b.c', '.c', '.b']

To do:

  • Add tests cases. I'm not sure how to go about adding test cases for this with what you've got setup.
  • Improve variable/function names. The names I used were more to make it clear what was being done differently than being accurate with the terminology.

@sk-
Copy link
Owner

sk- commented Mar 8, 2016

Thanks @MyklClason for start working on this.

Instead of reusing the extension field and picking the best extension I would an extra field called path_matches that is a regex that should match the filename (doing some normalization for path separators).

I think this is more general and does not abuse the extension concept.

@MyklClason
Copy link
Contributor Author

As for the extension abuse. I completely agree it's abuse. I wasn't sure on what to call it offhand. I'll use these names instead.

_best_match_extension => _best_match
_possible_extensions  => _possible_matches
ext(s)                => match(es)

My regex is weak so I was unable to create a regex that could duplicate the functionality of _possible_extensions (_possible_matches). That being said, it does seem trivial to use config directly to auto-build a regex in the format of (.html|.xml|pattern|...). For now, I'll just implement a basic version that won't break any tests. Full integration will require a bit of re-writing as it'll require switching fully from extensions to file names. Including various changes to the config file and linters.py. For now I've left it as using the "best match" approach rather than using multiple sets of linters.

Note:
This method may become problematic down the road as the number of linters increase due to regex build times. Memoization will help some, but may not be enough.

@MyklClason
Copy link
Contributor Author

Regex worked out rather well, but ended up having to write a priority function to get the expected results. I'm also not sure the regex is efficient.

Edit: I'm not sure why this is failing. Seems like json.json is valid and should be parsed as .json.
Edit2: Seems like the test data might be bad if nonewerror.json is supposed to be error-free as GitHub and the linter shows it having an error. Though it's odd that there's an error in Python 2.7 only.

@MyklClason
Copy link
Contributor Author

Also out of curiosity, why are you using import gitlint.utils as utils instead of simply import utils?

I ask because I end up commenting out that line whenever I'm testing this. Here is what I'm using (rather simple approach). Before when I was working with Python, I normally don't use the terminal, but using Cloud9 as my IDE these days.

root/gitlint $ python -c 'import linters; print linters._best_match("a",{".a":1,".b":2,".c":3,".b.c":5,"a":0})'

@sk-
Copy link
Owner

sk- commented Mar 9, 2016

I meant something different.

Instead of reutilizing the extension field in the configuration, I would add an extra field called path_matches, which is a regex.

Then when parsing the config instead of returning a dict that maps from extension to linter you return a Config object, which has a method called get_linters(filename).

Then in the lint function instead of for linter in config[ext]: you do for linter in config.get_linters(filename):.

For example if you want to run a linter for travis.yml files you would have:

travis:
  path_matches: "/\\.travis\\.yml$"
  ...

@MyklClason
Copy link
Contributor Author

Ah yes. I had similar thoughts while I was working on it, in terms of how to adjust the config. I was actually thinking of using something exactly like config.get_linters(filename), but it seems like that should get all linters, thus I didn't use it since I was still in the best match mindset.

As a potential issue, it brings up the question of things like, "What if there's say a config.js file that uses a specialized version of javascript? That method results in using both config.js and .js, which may not be good." Prioritization of the results of the regexes seems to solve that, perhaps with a command to switch between best match linter and all matches linter. That and to do it right, I'd have to make a number of changes that break tests, which I may or may not be able to fix. Such as changing the below error message to something like no linter defined or enabled for filename, "%s" % filename, but I could probably handle that much.

'skipped': ['no linter is defined or enabled for files'
                            ' with extension "%s"' % ext]

For backwards compatibility and perhaps simplification, it might be good to still use extensions. Essentially, if path_matches isn't provided, then an extension matching regex will be generated from the extensions.

Still though, seems good to move the get_linters aspect to the config file. I'll do that in about a day.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载