这是indexloc提供的服务,不要输入任何密码
Skip to content

Conversation

@jjhelmus
Copy link
Contributor

@jjhelmus jjhelmus commented Dec 5, 2018

Create pyc files in a single pass rather than using multiple calls to py_compile.

closes #7621

add compile_multiple_pyc function to the gateways/disk/create module

Function compiles multiple pyc file in a single call to python.
Add MultiPathAction, PrefixMultiPathAction, CreateInPrefixMultiPathAction and
CompileMultiPycAction classes
@jjhelmus jjhelmus requested a review from a team as a code owner December 5, 2018 22:02
@jjhelmus
Copy link
Contributor Author

jjhelmus commented Dec 5, 2018

@kalefranz We discussed this yesterday, would appreciate your feedback on these changes.

@jjhelmus jjhelmus changed the title WIP: Single pass pyc creation Single pass pyc creation Dec 7, 2018
@msarahan
Copy link
Contributor

msarahan commented Dec 7, 2018

Seems reasonable to me. Are the existing tests sufficient? Is it worthwhile to put a time limit test in that conservatively passes with the new code, but which would fail with the old code?

@jjhelmus
Copy link
Contributor Author

jjhelmus commented Dec 7, 2018

Seems reasonable to me. Are the existing tests sufficient? Is it worthwhile to put a time limit test in that conservatively passes with the new code, but which would fail with the old code?

A few tests are needed to replace the use of CompilePycAction. done

I'd also need to test the compile_multiple_pyc function on Windows to see if there is a size limit on stdin in that platform.

@mingwandroid
Copy link
Contributor

stdin is a pipe on Windows so works much the same, no limit.

Make sure py_full_paths and pyc_full_paths can be interated over multiple times
by casting them to tuples.  Add an early end to the function if no py file
needs to be compiled.
Special case the compilation of pyc on windows to account for the size
limitation of stdin and the maximum length of the command line arguments.

pyc compilation routine on windows borrowed from conda-build's post.py module.
@jjhelmus
Copy link
Contributor Author

jjhelmus commented Dec 7, 2018

The integration tests for pyc compiling on Windows are failing because the .py file has not been linked into the environment when the compilation step is called:

DEBUG conda.gateways.subprocess:subprocess_call(48): executing>> "C:\Users\appveyor\AppData\Local\Temp\1\45cf ������\python.exe" -Wi -m py_compile -
INFO conda.gateways.disk.create:compile_multiple_pyc(367): pyc file failed to compile successfully
python_exe_full_path: C:\Users\appveyor\AppData\Local\Temp\1\45cf ������\python.exe
py_full_path: C:\Users\appveyor\AppData\Local\Temp\1\45cf ������\Lib\site-packages\itsdangerous.py
pyc_full_path: C:\Users\appveyor\AppData\Local\Temp\1\45cf ������\Lib\site-packages\__pycache__\itsdangerous.cpython-37.pyc
compile rc: 1
compile stdout: 
compile stderr: [Errno 2] No such file or directory: 'C:\\Users\\appveyor\\AppData\\Local\\Temp\\1\\45cf ������\\Lib\\site-packages\\itsdangerous.py'

The order of the actions may need to be specified so that the pyc compilation always occurs after the linking.

@msarahan
Copy link
Contributor

msarahan commented Dec 7, 2018

Ordering actions seems wise. Please go ahead with that idea.

@jjhelmus
Copy link
Contributor Author

jjhelmus commented Dec 7, 2018

I think the unicode characters in the prefix name and path were causing issues on Windows. Hopefully escaping them and using the proper linesep fixes the issue.

@mingwandroid
Copy link
Contributor

You can call compile with an argument for each filename too AFAIK.

@jjhelmus
Copy link
Contributor Author

jjhelmus commented Dec 8, 2018

The filesystem encoding may be different than the encoding used for stdin. The documentation for sys.stdin, which is what py_compile is reading from, indicates that when the stream is not interactive, stdin uses a "ANSI code page" encoding. I think this can be determined from locale.getpreferredencoding(), but I'd preferred to interrogate the interpreter itself to determine the encoding.

Determine the encoding for stdin by calling python and extracting the results
from the output. Use this encoding to encode the list of Python files to
compile to pyc files.
@jjhelmus
Copy link
Contributor Author

jjhelmus commented Dec 8, 2018

You can call compile with an argument for each filename too AFAIK.

You need to limit the length of the arguments in this case. e96463c implemented this by calling py_compile multiple times and is a good option if the correct encoding for stdin cannot be determined reliably.

@msarahan
Copy link
Contributor

msarahan commented Dec 8, 2018

failing on spaces. Trying to figure out proper quoting.

@jjhelmus
Copy link
Contributor Author

jjhelmus commented Dec 8, 2018

The scheme used in b3cb857 will fail to compile .py files if the paths use characters which cannot be represented in the stdin character encoding. For example on windows a path may have UTF-8 characters which are not present in the cp1252 code page. This would be very unusual and the old scheme may have also failed in these cases.

@jjhelmus
Copy link
Contributor Author

jjhelmus commented Dec 8, 2018

Using compileall has the potential to compile a lot of unnecessary python files which will then be removed. For a large environment this would happen for each noarch package installed. Is this a performance concern?

@mingwandroid
Copy link
Contributor

mingwandroid commented Dec 8, 2018

You can call compile with a list of files as arguments, just remove the - from Jonathan's command line and add the files..

@msarahan
Copy link
Contributor

msarahan commented Dec 9, 2018

Yeah, that's why I put in that search for extra pyc files. I have no idea what the performance implication is without some sort of benchmark. It's pointed solely at site-packages, which I think dramatically limits potential problems.

Ray, I know that you can compile with a list of arguments. The trouble is that that list of arguments is subject to this hell of encoding nonsense. I'm trying to find ways to skirt that hell.

@mingwandroid
Copy link
Contributor

How about emit to a file as utf-8 a line per file, then write a simple script that reads that back in?

@jjhelmus
Copy link
Contributor Author

jjhelmus commented Dec 9, 2018

How about emit to a file as utf-8 a line per file, then write a simple script that reads that back in?

compileall can take a file containing a list of file as input via the -i argument. The question is where to store said file?

@msarahan
Copy link
Contributor

msarahan commented Dec 9, 2018

doesn't matter - some tempdir?

@mingwandroid
Copy link
Contributor

I've heard of pair programming but this is even better! Even if I'm just in the backseat.

# -j 0 will do the compilation in parallel, with os.cpu_count() cores
if int(py_ver[0]) >= 3 and int(py_ver.split('.')[1]) > 5:
command.extend(["-j", "0"])
command = ["-Wi", "-m", "compileall", "-q", "-i", tf.name]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-l might be prudent

@msarahan
Copy link
Contributor

msarahan commented Dec 9, 2018

Sorry @jjhelmus I'm done for the day on this. I think we're back where we started, but hopefully we know a bit more now.

@msarahan msarahan merged commit 761ee60 into conda:master Dec 9, 2018
@github-actions
Copy link

Hi there, thank you for your contribution to Conda!

This pull request has been automatically locked since it has not had recent activity after it was closed.

Please open a new issue or pull request if needed.

@github-actions github-actions bot added the locked [bot] locked due to inactivity label Aug 31, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 31, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

locked [bot] locked due to inactivity

Projects

None yet

Development

Successfully merging this pull request may close these issues.

allow for multi-path PathActions; build .pyc files in one pass per package

3 participants