Optimizing translate large result set #72
There's also a stylistic improvement I'd like to make while working on the code -- I think using a combination of keywords and structs (e.g. |
9da271b to 7bfbec1
It turns out fully portable CL can achieve similar performance! I pushed a new commit with In fact I think maybe we can rewrite everything (that involves |
Looks like I'm inclined to specify it only when we're definitely on the last chunk, i.e. |
In this branch, I collect the results of
Thanks! And take your time :) I'll try to figure out the offending part as well. Update: fixed! Update 2: I removed |
bb299ab to 02db3ee
Congrats on being the first contributor to this library! It proves at least some people can decipher my CL code ;)
I've now tried your changes and the performance increase is baffling. I'm embarrassed how inefficient the previous version was, even if it was still pretty fast :)
The only thing I've noticed is that
(type-of (ddb:get-result (duckdb:q "SELECT true::boolean AS a") 'a))
(SIMPLE-VECTOR 1)
A bit vector would probably be an efficient representation, but it would break compatibility as it is using
I'm merging this now since it doesn't break anything, we can sort out booleans later if needed. Great work overall! |
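To make the bit-vector trade-off concrete, here is a minimal sketch in portable CL. The helper names are hypothetical and not part of this library. It packs generalized booleans into a bit vector and shows why that would not be a drop-in replacement: the elements become 0 and 1, and 0 counts as true in CL.

```lisp
;; Hypothetical helpers for illustration only; not part of the library.

(defun booleans->bit-vector (booleans)
  "Pack a sequence of generalized booleans into a bit vector (T -> 1, NIL -> 0)."
  (map 'bit-vector (lambda (b) (if b 1 0)) booleans))

(defun bit-vector->booleans (bits)
  "Unpack a bit vector back into a general vector of T and NIL."
  (map 'simple-vector (lambda (bit) (if (= bit 1) t nil)) bits))

(defun count-true-naively (column)
  "Counts elements that are true in the CL sense. On a bit vector this
counts every element, because both 0 and 1 are non-NIL."
  (count-if #'identity column))
```

Code written against the simple-vector representation, such as `count-true-naively`, would silently give different answers on a bit vector, which is the compatibility concern raised above.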
Thanks! I'm happy it helps :)
Indeed, and that's intended. The CL standard doesn't define what kind of specialized Do you have any thoughts on the following?
|
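As a side note on what the standard leaves open: it only requires specialized arrays for `bit` and the character types, and everything else is up to the implementation. A small sketch for checking what a given implementation does (the function name is hypothetical; `upgraded-array-element-type` is standard CL):

```lisp
;; Hypothetical helper for illustration; UPGRADED-ARRAY-ELEMENT-TYPE is standard.
(defun specialized-element-type-p (element-type)
  "True if this implementation has an array representation specialized for
ELEMENT-TYPE, i.e. the type does not upgrade all the way to T."
  (not (eq (upgraded-array-element-type element-type) 't)))

(defun report-upgraded-types (&optional (types '(bit
                                                 (unsigned-byte 8)
                                                 (signed-byte 64)
                                                 single-float
                                                 double-float)))
  "Print how each element type is upgraded. Apart from BIT, the results
are implementation-dependent."
  (dolist (type types)
    (format t "~&~S -> ~S~%" type (upgraded-array-element-type type))))
```

Calling `(report-upgraded-types)` on another implementation is a quick way to see which column types would actually get unboxed storage there.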
Sorry I've missed your question.
I think users can fix that in their query if they want, pretty easy to do via |
Hi! This is a draft of my take on #67. It takes 1.56s for the
(progn (setq *test* (time (ddb:q "FROM read_csv('~/Downloads/8ysW.csv')"))) nil)
benchmark on SBCL! With the memcpy methods disabled, it runs in 5s.
This is currently just a proof of concept, and I'd like to discuss the interface:
*features*
instead? It passes all test cases in its current form (i.e. with specialized arrays enabled by default) anyway, so I guess not much code really relies on unspecialized arrays?
*sql-null-return-value*
, should there be some option to prevent falling back to an unspecialized array (e.g. fill in -1 for uint, and fill in NaN for floats)?
Besides, I don't use other CL implementations, but I expect the unboxed translators to work on most of them. Feel free to test it and add to the
#+sbcl
block!
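To illustrate the fallback being discussed, here is a rough sketch and not the code in this branch. The function name, the validity-mask argument and the `:null-value` keyword are all assumptions made for the example. A column without NULLs gets a specialized array, and a column with NULLs falls back to a general vector so the NULL marker fits.

```lisp
;; Illustrative sketch only. TRANSLATE-COLUMN, VALIDITY and :NULL-VALUE are
;; hypothetical names, not the library's actual interface.
(defun translate-column (values validity element-type &key (null-value nil))
  "VALUES is a sequence of raw values; VALIDITY is a bit vector of the same
length where 1 means non-NULL. Every entry of VALUES must be of ELEMENT-TYPE,
including the placeholder entries at NULL positions."
  (let ((n (length values)))
    (if (every #'plusp validity)
        ;; No NULLs: a specialized (possibly unboxed) array holds everything.
        (make-array n :element-type element-type :initial-contents values)
        ;; Some NULLs: fall back to a general vector so NULL-VALUE fits.
        (let ((result (make-array n)))
          (dotimes (i n result)
            (setf (aref result i)
                  (if (plusp (aref validity i))
                      (elt values i)
                      null-value)))))))
```

For example, `(translate-column '(1 0 3) #*101 '(unsigned-byte 8))` takes the fallback path and yields a general vector with `NIL` in the middle. An option that fills a sentinel instead would keep the first branch for every column, at the cost of NULLs no longer being distinguishable from real values equal to the sentinel.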