austintlee

No description provided.

result = []
# We only fetch the minimum required fields for full document retrieval/reconstruction
if query_params.reconstruct_document:
    query_params.kwargs["_source_includes"] = "doc_id,parent_id,properties"
Author

I looked into fetching only "parent_id", but retrieving only child documents is not enough here. We also need the parent docs (which have no "parent_id") and their properties, because explode() does not transfer all properties from parent to child docs:

for doc_property in parent.properties.keys():
    if doc_property.startswith("_"):
        cur.properties[doc_property] = parent.properties[doc_property]

Otherwise, we would need a second round of calls to OpenSearch just to fetch the parent docs, which may be less efficient and more time-consuming than fetching doc_id, parent_id and properties here.
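
To make the concern concrete, here is a minimal sketch of the property-copy behavior quoted above. The `Doc` class and `make_child` helper are hypothetical stand-ins written for illustration, not the project's actual classes; only the inner loop mirrors the quoted snippet.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Doc:
    # Hypothetical stand-in for the real document type (assumption).
    doc_id: str
    parent_id: Optional[str] = None
    properties: Dict[str, Any] = field(default_factory=dict)


def make_child(parent: Doc, child_id: str) -> Doc:
    """Copy only underscore-prefixed properties, as in the quoted loop."""
    cur = Doc(doc_id=child_id, parent_id=parent.doc_id)
    for doc_property in parent.properties.keys():
        if doc_property.startswith("_"):
            cur.properties[doc_property] = parent.properties[doc_property]
    return cur


# Illustrative data only; the property names are made up.
parent = Doc("p1", properties={"_schema": "v1", "title": "Report"})
child = make_child(parent, "c1")

# Properties that exist on the parent but never reach the child.
missing = sorted(set(parent.properties) - set(child.properties))
print(missing)  # ['title']
```

In this sketch, a reader that sees only child documents cannot recover `title`, which is why the parent docs and their properties are requested in the same query.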

Collaborator

I don't understand why you need properties. At least in my testing, I was getting parent IDs in the _id field.

Author

Do we not need to reconstruct all of the properties? If that is the case, I think we can just reconstruct using only the child elements.
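
For reference, a rough sketch of what grouping by parent could look like when hits carry only the three fetched fields. The hit shape (plain dicts with `doc_id`, `parent_id`, `properties`) and the `group_by_parent` helper are assumptions for illustration, not the reader's actual code.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

Hit = Dict[str, Any]


def group_by_parent(hits: List[Hit]) -> Tuple[Dict[str, Hit], Dict[str, List[Hit]]]:
    """Split hits into parent docs (no parent_id) and children keyed by parent_id."""
    parents: Dict[str, Hit] = {}
    children: Dict[str, List[Hit]] = defaultdict(list)
    for hit in hits:
        parent_id = hit.get("parent_id")
        if parent_id is None:
            parents[hit["doc_id"]] = hit
        else:
            children[parent_id].append(hit)
    return parents, dict(children)


# Illustrative hits only; the values are made up.
hits = [
    {"doc_id": "c1", "parent_id": "p1", "properties": {"_schema": "v1"}},
    {"doc_id": "c2", "parent_id": "p1", "properties": {"_schema": "v1"}},
    {"doc_id": "p1", "properties": {"_schema": "v1", "title": "Report"}},
]
parents, children = group_by_parent(hits)
print(sorted(children))                       # ['p1']
print([c["doc_id"] for c in children["p1"]])  # ['c1', 'c2']
```

If the parent hit is left out, the children still group under `p1`, but `parents` stays empty, so only the properties already present on the children are available for reconstruction.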

Author

This is the test I ran to verify correctness:

def test_ingest_and_read(self, setup_index, exec_mode):
