+
Skip to content

Implementing skip_blank option #37

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 24 commits into from
Aug 27, 2022
Merged

Conversation

aroberge
Copy link
Contributor

I've implemented the skip_blank option by adding a new line type, and adding a new Option.
Here's an example where the default (skipping blank lines) is used:

Traceback (most recent call last):
 File "C:\Users\Andre\github\stack_data\example.py", line 13, in <module>
       7 |     length = len(seq)
       9 |     return seq[length]
      12 | seq = [1, 2, 3]
-->   13 | last = last_item(seq)
                  ^^^^^^^^^^^^^^
 File "C:\Users\Andre\github\stack_data\example.py", line 9, in last_item
       5 | def last_item(seq):
       7 |     length = len(seq)
-->    9 |     return seq[length]
                      ^^^^^^^^^^^
IndexError: list index out of range

And here's an example where skip_blank is set to False:

Traceback (most recent call last):
 File "C:\Users\Andre\github\stack_data\example.py", line 14, in <module>
       7 |     length = len(seq)
       : |
      10 |     return seq[length]
       : |
      13 | seq = [1, 2, 3]
-->   14 | last = last_item(seq)
                  ^^^^^^^^^^^^^^
 File "C:\Users\Andre\github\stack_data\example.py", line 10, in last_item
       5 | def last_item(seq):
       6 |
       7 |     length = len(seq)
       : |
-->   10 |     return seq[length]
                      ^^^^^^^^^^^
IndexError: list index out of range

@aroberge
Copy link
Contributor Author

aroberge commented Aug 14, 2022

Here's a different possibility when multiple lines are skipped:

Traceback (most recent call last):
 File "C:\Users\Andre\github\stack_data\example.py", line 14, in <module>
       7 |     length = len(seq)
       :
      10 |     return seq[length]
       :
      13 | seq = [1, 2, 3]
-->   14 | last = last_item(seq)
                  ^^^^^^^^^^^^^^
 File "C:\Users\Andre\github\stack_data\example.py", line 10, in last_item
       5 | def last_item(seq):
       6 |
       7 |     length = len(seq)
       :
-->   10 |     return seq[length]
                      ^^^^^^^^^^^
IndexError: list index out of range

I admit that it is a bit of a contrived examples, as there are rarely two blank lines in the code, except before functions and class definitions. Still, I have a slight preference for this one, using only : and not : |, as it is less cluttered and shows better the gap -- I think.

Edit:

Note that I have not made a change to the committed code to reflect this last example.

@@ -521,6 +534,10 @@ def __init__(
self.code = frame.f_code
self.options = options or Options() # type: Options
self.source = self.executing.source # type: Source
if hasattr(self.options, "skip_blank"):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to account for self.options being the wrong type. I also don't think we need a FrameInfo.skip_blank attribute/property anyway - self.options.skip_blank seems fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, for not checking that self.options has the correct type. If I use an enum, or something else with three options for skip_blank, I will have to do a test of some kind.

This comment was marked as resolved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I clicked on resolved by accident and forgot about it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should now have been taken care of.

@alexmojaki
Copy link
Owner

Still, I have a slight preference for this one, using only : and not : |, as it is less cluttered and shows better the gap -- I think.

You could add a Formatter keyword argument that requires a string which will be formatted in place and right-aligned to line up with the | from normal lines, so ": " (that's two spaces at the end) and ": |" would both be allowed and satisfy both your ideas. Some might also want to use a proper unicode vertical ellipsis with 3 dots.

@aroberge
Copy link
Contributor Author

You could add a Formatter keyword argument that requires a string which will be formatted in place and right-aligned to line up with the | from normal lines, so ": " (that's two spaces at the end) and ": |" would both be allowed and satisfy both your ideas. Some might also want to use a proper unicode vertical ellipsis with 3 dots.
Ok. Since this symbol would precede an empty line, there would b no need to add spaces after ":".

@alexmojaki
Copy link
Owner

btw skip_blank might not be a good name if there are three modes.

@aroberge
Copy link
Contributor Author

While everything seems to work correctly, more improvements to the code needs to be done. I will try to finish it tomorrow.

@aroberge
Copy link
Contributor Author

aroberge commented Aug 16, 2022

Other than missing unit tests, I believe that this is now ready. To illustrate the effect of each option, I will use the following example which includes many blank lines:

import stack_data
stack_data.Formatter().set_hook()

def last_item(seq):

    length = len(seq)


    return seq[

        length
    ]


seq = [1, 2, 3]
last = last_item(seq)

Default output

The default is the current behaviour of stack_data, with a small cosmetic fix explained near the end of this comment.

Traceback (most recent call last):
 File "C:\Users\Andre\github\stack_data\example.py", line 16, in <module>
       6 |     length = len(seq)
       9 |     return seq[
      10 |
      11 |         length
      12 |     ]
      15 | seq = [1, 2, 3]
-->   16 | last = last_item(seq)
                  ^^^^^^^^^^^^^^
 File "C:\Users\Andre\github\stack_data\example.py", line 9, in last_item
       4 | def last_item(seq):
       6 |     length = len(seq)
-->    9 |     return seq[
                      ^^^^
      10 |
      11 |         length
               ^^^^^^^^^^
      12 |     ]
               ^
IndexError: list index out of range

Making empty lines "visible" as text.

Setting the following parameters:

options = stack_data.Options(blank_lines=stack_data.BlankLines.VISIBLE)
stack_data.Formatter(options=options).set_hook()

The result is as follows:

Traceback (most recent call last):
 File "C:\Users\Andre\github\stack_data\example.py", line 17, in <module>
       7 |     length = len(seq)
       8 |
       9 |
      10 |     return seq[
      11 |
      12 |         length
      13 |     ]
      14 |
      15 |
      16 | seq = [1, 2, 3]
-->   17 | last = last_item(seq)
                  ^^^^^^^^^^^^^^
 File "C:\Users\Andre\github\stack_data\example.py", line 10, in last_item
       5 | def last_item(seq):
       6 |
       7 |     length = len(seq)
       8 |
       9 |
-->   10 |     return seq[
                      ^^^^
      11 |
      12 |         length
               ^^^^^^^^^^
      13 |     ]
               ^
IndexError: list index out of range

Using only line number "markers" to indicate blank lines.

Setting the following parameters:

options = stack_data.Options(blank_lines=stack_data.BlankLines.LINE_NUMBER)
stack_data.Formatter(options=options).set_hook()

gives the following result (notice that : is used to indicate more than one blank line being skipped).

Traceback (most recent call last):
 File "C:\Users\Andre\github\stack_data\example.py", line 17, in <module>
       7 |     length = len(seq)
       :
      10 |     return seq[
      11 |
      12 |         length
      13 |     ]
       :
      16 | seq = [1, 2, 3]
-->   17 | last = last_item(seq)
                  ^^^^^^^^^^^^^^
 File "C:\Users\Andre\github\stack_data\example.py", line 10, in last_item
       5 | def last_item(seq):
       6 |
       7 |     length = len(seq)
       :
-->   10 |     return seq[
                      ^^^^
      11 |
      12 |         length
               ^^^^^^^^^^
      13 |     ]
               ^
IndexError: list index out of range

Setting a different string for blank lines that are "skipped".

Using the following parameters

options = stack_data.Options(blank_lines=stack_data.BlankLines.LINE_NUMBER)
stack_data.Formatter(options=options, line_number_gap_string=": |").set_hook()

gives the following result:

Traceback (most recent call last):
 File "C:\Users\Andre\github\stack_data\example.py", line 17, in <module>
       7 |     length = len(seq)
       : |
      10 |     return seq[
      11 |
      12 |         length
      13 |     ]
       : |
      16 | seq = [1, 2, 3]
-->   17 | last = last_item(seq)
                  ^^^^^^^^^^^^^^
 File "C:\Users\Andre\github\stack_data\example.py", line 10, in last_item
       5 | def last_item(seq):
       6 |
       7 |     length = len(seq)
       : |
-->   10 |     return seq[
                      ^^^^
      11 |
      12 |         length
               ^^^^^^^^^^
      13 |     ]
               ^
IndexError: list index out of range

Added explanation for a comment.

Using the default values, here's what happen if I were to remove the following code in range_from_node:

        if range_start == range_end == 0:
            # This is an empty line. If it were included, it would result
            # in a value of zero for the common indentation assigned to
            # a block of code.
            return None

Here is the result:

Traceback (most recent call last):
 File "C:\Users\Andre\github\stack_data\example.py", line 16, in <module>
       6 |     length = len(seq)
       9 |     return seq[
      10 |
      11 |         length
      12 |     ]
      15 | seq = [1, 2, 3]
-->   16 | last = last_item(seq)
                  ^^^^^^^^^^^^^^
 File "C:\Users\Andre\github\stack_data\example.py", line 9, in last_item
       4 | def last_item(seq):
       6 |     length = len(seq)
-->    9 |     return seq[
                      ^^^^
      10 |
      11 |         length
           ^^^^^^^^^^^^^^
      12 |     ]
           ^^^^^
IndexError: list index out of range

Notice how, due to the blank line 10 in the marked executing piece, the markers on lines 11 and 12 extend all the way back to the first column.

Minor cosmetic bug fix

In Formatter.format_line, I now have the following:

                # if end <= start, we have an empty line inside a highlighted
                # block of code. In this case, we need to avoid inserting
                # an extra blank line with no markers present.
                if end > start:
                    result += (
                            " " * (start + len(prefix))
                            + self.executing_node_underline * (end - start)
                            + "\n"
                    )

If the end > start test is removed, this is what happens:

Traceback (most recent call last):
 File "C:\Users\Andre\github\stack_data\example.py", line 16, in <module>
       6 |     length = len(seq)
       9 |     return seq[
      10 |
      11 |         length
      12 |     ]
      15 | seq = [1, 2, 3]
-->   16 | last = last_item(seq)
                  ^^^^^^^^^^^^^^
 File "C:\Users\Andre\github\stack_data\example.py", line 9, in last_item
       4 | def last_item(seq):
       6 |     length = len(seq)
-->    9 |     return seq[
                      ^^^^
      10 |

      11 |         length
               ^^^^^^^^^^
      12 |     ]
               ^
IndexError: list index out of range

Notice the gap between lines 10 and 11 in the marked piece.

@aroberge aroberge requested a review from alexmojaki August 16, 2022 11:08
Copy link
Owner

@alexmojaki alexmojaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is looking pretty good. My points are minor, the biggest remaining thing is the tests.

@@ -521,6 +534,10 @@ def __init__(
self.code = frame.f_code
self.options = options or Options() # type: Options
self.source = self.executing.source # type: Source
if hasattr(self.options, "skip_blank"):

This comment was marked as resolved.

@aroberge
Copy link
Contributor Author

The code coverage has slightly decreased due to:

  • The addition of a new ValueError exception that could be raised if Options is not initialized correctly.
  • The absence of testing for the BlankLines.SINGLE when an exception is not raised.

Are unit tests needed for these two cases?

@alexmojaki
Copy link
Owner

The code coverage has slightly decreased due to:

  • The addition of a new ValueError exception that could be raised if Options is not initialized correctly.
  • The absence of testing for the BlankLines.SINGLE when an exception is not raised.

Are unit tests needed for these two cases?

The first point seems trivial to test. Just construct a bad formatter and assert the raised exception. No need to do any formatting.

I don't know what the second point means.

@@ -168,7 +179,7 @@ def format_line(self, line: Line) -> str:
result += " "

if self.show_linenos:
result += "{:4} | ".format(line.lineno)
result += self.line_number_format_string.format(line.lineno)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this new option? It seems unrelated to the rest of the PR. And shouldn't it be used in format_blank_lines_linenumbers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it as the indentation was fixed and I might want a larger value in friendly_traceback; it is the only hard-coded value you had in stack_data, while all the others could be changed by the users. And, yes, I overlooked the fact that it should have been used in format_blank_lines_linenumbers as well. I will need to do this.

if self.current_line_indicator:
result = " " * len(self.current_line_indicator)
else:
result = " "
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://coveralls.io/builds/51814297/source?filename=stack_data%2Fformatting.py#L214 points out that this line is uncovered, and it seems worth testing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I probably won't have time to do this tonight though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had added this else part based on the presence of the second term in

result = result or " "

However, it turns out that result was never null and that this line was redundant in the original code, based on all existing tests. So I removed the corresponding else part of my addition.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's pretty easy to see when the line isn't redundant in the previous code: when you set show_linenos=False, current_line_indicator="". result won't be null, it'll be an empty string. Defaulting it to spaces ensures that the user's code is indented like in a standard traceback. I guess it should have been tested before, but coverage didn't see the gap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want me to reinstate it and write a relevant test for it?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes please.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there's been some confusion or misremembering. The test needs to specifically have both the arguments show_linenos=False, current_line_indicator="" in one formatter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was wondering why the line result = result or " " did not seem to have any effect. I will fix this.

@aroberge
Copy link
Contributor Author

  • The absence of testing for the BlankLines.SINGLE when an exception is not raised.

Are unit tests needed for these two cases?

The first point seems trivial to test. Just construct a bad formatter and assert the raised exception. No need to do any formatting.

I don't know what the second point means.
I'm thinking of something like print_stack when many blank lines are printed. I didn't try it out to confirm that it would show changes. I'll need to do some explicit tests to confirm that a unit test might be needed.

Comment on lines 119 to 124
try:
MyFormatter(show_linenos=False, options=Options(blank_lines=BlankLines.SINGLE))
except ValueError:
assert True
else:
assert False
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
try:
MyFormatter(show_linenos=False, options=Options(blank_lines=BlankLines.SINGLE))
except ValueError:
assert True
else:
assert False
with pytest.raises(ValueError):
MyFormatter(show_linenos=False, options=Options(blank_lines=BlankLines.SINGLE))

and an import pytest at the top.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

@@ -166,12 +177,13 @@ def format_line(self, line: Line) -> str:
else:
result = " " * len(self.current_line_indicator)
result += " "
else:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be removed, the original line result = result or " " is enough. A test without current_line_indicator but with show_linenos would confirm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are, of course, correct. I had a brain freeze when I added this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested this by creating two additional tests (with or without blank lines included). Do you want these extra tests added in a commit?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you've already written the test etc. then might as well. But no need for extra effort. Just remove these two lines, unless the tests show otherwise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the two lines, the result is as follows:

Traceback (most recent call last):
 File "formatter_example.py", line 85, in blank_lines
     79 | def blank_lines():
     80 |     a = [1, 2, 3]
     82 |     length = len(a)
     85 |     return a[length]
                     ^^^^^^^^^
IndexError: list index out of range

Without them, the indentation is slightly less

Traceback (most recent call last):
 File "formatter_example.py", line 85, in blank_lines
  79 | def blank_lines():
  80 |     a = [1, 2, 3]
  82 |     length = len(a)
  85 |     return a[length]
                  ^^^^^^^^^
IndexError: list index out of range

Which one do you prefer? Once decided, it probably would make sense to include the tests.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, it looks nicer with the extra spaces, I didn't expect that. Especially if the line numbers reach 4 digits. So let's go with the former option, and then I think result = result or " " isn't needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed to modified the code when using blank_lines.HIDDEN with no current line marker so that it would be consistent with the others. I added three tests, one for each setting.

Copy link
Owner

@alexmojaki alexmojaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All looks good, thanks! Happy for me to merge?

@aroberge
Copy link
Contributor Author

Yes, please do!

@alexmojaki alexmojaki merged commit 0b69516 into alexmojaki:master Aug 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载