-
Notifications
You must be signed in to change notification settings - Fork 32
Migrate from spansql to memefish #66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is no wrong implementation.
memefish itself doesn't target usecase as SQL builder, so I think dynamic SQL is OK.
I expects there is some children nodes are not handled, but it will be handled in issues and following PRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! Thank you for the changes to memefish. I just left comments on a few minor parts.
internal/hammer/diff.go
Outdated
m[stmt.Name] = t | ||
case *spansql.CreateIndex: | ||
if t, ok := m[stmt.Table]; ok { | ||
m[stmt.Name.SQL()] = t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI
(*ast.Path).SQL()
emits quoted identifier like
`group`.`table`
so it may be better to check len(path.Idents()) == 1
and add test cases with named schemas and names which is the same to keywords.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is expected after checking len(path.Idents()) == 1
?
I think the keys of m := make(map[string]*Table)
can include identifiers with named schemas, not just table names.
add test cases with named schemas and names which is the same to keywords.
I added test cases for named schema and table name that is keywords! (Although, CREATE SCHEME
is not yet supported with hammer).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I want to say is that originally hammer did not support named schema,
but do you really want to support named schema in this memefish migration PR?
If it is broken, I recommend that you reject the general path for now and consider fixing it in a separate PR. If it is possible to do so, I think it is fine to do it in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a result, named schema may be supported, but it only means that the identifier is now treated as an identifier, even if the identifier is just the table name or has a named schema.
I don't know if the original spansql.ID
included named schema or not. However, I believe that the intention of the code here is not to use the table name, but to use the identifier. There may be differences in behaviour for named schema, but the intent seems to be consistent and backwards compatible.
We could intentionally remove support for named schema, but then we would have more code to remove named schema, including indexes and so on. IMO, I think that if the case of len(path.Idents()) == 3
were to appear in the future, we wouldn't have to exclude it here.
Anyway, I leave the final decision to @daichirata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI
m[stmt.Name.SQL()]
can be m["`group`.`order`"]
for CREATE TABLE `group`.`order`
(group
and order
are keyword.).
But currently, GetDatabaseDdl
will emit CREATE TABLE `group.order`
for that table definition so the map will contain m["`group.order`"]
.
https://issuetracker.google.com/issues/385901554
I believe this inconsistency can break diffs using spanner:
prefix of hammer.
I recommend to unquote path identifiers as likestrings.Join(path.Name.Idents[].Name, ".")
(pseudo code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the good point about named schema!
Therefore, there shouldn’t be any breaking changes until hammer supports the CREATE SCHEMA DDL, and we can wait for GetDatabaseDdl to be fixed before adding support for CREATE SCHEMA.
I agree with this opinion. Since the CREATE SCHEMA statement cannot currently be parsed, I believe this issue does not manifest for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key of this map is not required to be correct Cloud Spanner table name (or table identifier), but just an identifier of the table.
It means we can also use anything that is "same tables output the same, and different tables output different", like "<schema>.<table>"
, [2]string{"<schema>", "<table>"}
, or even "<schema>🔧<table>"
.
So, at least me, I don't care how actually SQL()
format the table names. I just expect SQL()
outputs the same string for the same table (and ASTs generally satisfies that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with both sides on many points. But I also thought that even if it is actually useful, the better case is that the table identity is not affected by the SQL character representation.
If you were to write code on the hammer side that performs joins each time using a map with Idents as keys, would the scope of changes become significant? @morikuni
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a commit to stop usingSQL()
for comparison: 18a5bd3
Revert it if not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay in reaction. Thanks for fixing!
go.mod
Outdated
|
||
require ( | ||
cloud.google.com/go/spanner v1.62.0 | ||
github.com/cloudspannerecosystem/memefish v0.0.0-20241219043423-1efca7ff9732 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is better to use released version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to memefish v0.3.1
@morikuni I tested it with actual data and noticed the following issue: When the order of an index or primary key is ASC, it is omitted from the DDL returned by the Spanner API. As a result, if the user explicitly specified ASC, a new difference is detected. Could you check this? {
from: `
CREATE TABLE t1 (
t1_1 INT64 NOT NULL,
t1_2 INT64 NOT NULL,
) PRIMARY KEY(t1_1, t1_2);
`,
to: `
CREATE TABLE t1 (
t1_1 INT64 NOT NULL,
t1_2 INT64 NOT NULL,
) PRIMARY KEY(t1_1, t1_2 ASC);
`,
expected: []string{},
},
{
from: `
CREATE TABLE t1 (
t1_1 INT64 NOT NULL,
t1_2 INT64 NOT NULL,
) PRIMARY KEY(t1_1, t1_2);
CREATE INDEX idx_t1 ON t1(t1_1, t1_2);
`,
to: `
CREATE TABLE t1 (
t1_1 INT64 NOT NULL,
t1_2 INT64 NOT NULL,
) PRIMARY KEY(t1_1, t1_2);
CREATE INDEX idx_t1 ON t1(t1_1, t1_2 ASC);
`,
expected: []string{},
},
|
@daichirata Fixed the problem you mentioned by considering index without direction is ASC. |
@morikuni Thank you for the prompt correction! There is a similar issue with the primary key, so could you please fix it in the same way? |
@daichirata Fixed also for primary key! |
@morikuni We're still running tests, but I'll proceed with the merge for now. Thanks! |
@apstndb Thanks for the reassuring support! |
fix: #65
There are some DDLs that cannot be parsed by spansql, so I migrated to memefish while trying to maintain as much compatibility as possible. Due to differences in SQL formatting between spansql and memefish, there are significant diffs in the outputs. However, I believe the functionality remains unchanged.
spqnsqlではパースできないDDLがあるので、できるだけ互換性を保ったままmemefishに移行した。spansqlとmemefishのSQLのフォーマットの違いにより、diffに差分が大きく出ているが、動作は変わっていないと思う。