WO2025128636A1 - Computer-implemented system and method for maintaining data uniformity across independent databases - Google Patents
- Publication number
- WO2025128636A1 (PCT/US2024/059476)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- databases
- datasets
- independent groups
- computer
- database
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2365—Ensuring data consistency and integrity
Definitions
- Each local database 12, 22 is accessible to its respective local user 10, 20 through an application programming interface (API) Gateway 11 or 21.
- Each local database 12, 22 and the at least one CDB 30 comprise datasets comprising one or more data fields.
- the system comprises at least one Master Record (MR) 40.
- the master record 40 is a data file or database record that lists the datasets’ names and locations within their respective databases (e.g., 12, 22).
- a dataset profile consists of the data fields that comprise the dataset, their locations (for instance, table and column within an SQL database) across all databases 12, 22, etc., data format within each database (e.g., 12, 22), and the CDB (e.g., 30 or, in the case of multiple CDBs, another CDB) for that dataset.
- the dataset profile therefore maintains a one-to-one data alignment between datasets.
- Data format refers to how the data is structured within a database (e.g., 12, 22). For example, an ID number could be stored as a numerical or text value; a person’s job specialty could be stored as a text field or as a numerical value that references a specialty table, an address could be stored as a numerical and a text value, etc. Of course, these are only exemplary forms of data and not intended to be limiting of the claimed invention.
- the dataset lists and profiles within MR 40 are unique to that MR, just as the participating users’ (e.g., 10, 20) independent groups of databases (e.g., local databases 12, 22) will be unique to them.
- Access to the CDB(s) 30 and MR 40 is restricted to designated personnel, such as system administrators ("privileged users").
- a CDB 30 is a database that contains the primary reference for a specific dataset. Only one CDB can be indicated as the primary reference for any dataset.
- IDB: Incremental Data Backup
- an application server (Monitor) 50 continuously queries each IDB (e.g., 13, 23) for an update indicating a modification to any database.
- the Monitor 50 generates a timestamped record of each query. If an update within an IDB (e.g., 13, 23) has a later timestamp than the prior Monitor query, the Monitor 50 initiates an update process.
- the update process consists of the Monitor 50 identifying the data fields that were updated in an IDB and then querying the MR to identify which datasets were impacted and which CDBs hold each of those datasets.
- updates to data in “Fields” identified by arrows are shown.
- the API Gateway 11 in the example of FIG. 2 will update in the IDB a "Last Edited By" data field with data identifying the user (USER 1 in the example of FIG. 2) and will populate a "Source” data field with the name of the modified database (in this case, DATABASE A).
- Monitor 50 copies those updated datasets from, in this example, IDB A 12 to the CDB 30 through an API Gateway CD 60.
- Monitor 50 performs the same continuous querying on IDB 31 as described above for the IDBs A and B 13, 23 of the local databases 12, 22.
- Monitor 50 queries the MR 40 to determine the location and data format of the datasets on all databases and then copies the datasets to the relevant local databases (in the example, database 22) through API Gateway CD 60, as indicated by the arrows A in FIG. 1.
- A dataset that is the source for a database update to the CDB 30 does not then get overwritten by API Gateway CD 60. More specifically, according to the illustrated example, when data is synchronized from the CDB 30 to any local databases that were not the source of the modification, API Gateway CD 60 will overwrite the "Last Edited By" and "Source" values to be, respectively, CDB 30 and the source database (e.g., DATABASE A or DATABASE B).
- An IDB (e.g., 13) does not trigger an update to the CDB 30 as described above when the "Last Edited By" value is the CDB.
- An update is triggered only when the “Last Edited By” value is other than the CDB.
- the APIs used to reference the MR 40 and update databases do not have edit rights to the MR 40.
- a scenario comprehended by the present invention is when the same dataset across two or more local databases is updated simultaneously by separate local users on their respective independent local databases. As shown in FIG. 3, each database is assigned a priority group; e.g., Group A, designated “Priority 1,” comprising databases A, C, and E; and Group B, designated “Priority 2,” comprising databases B, D, and F.
- a dataset cannot exist on more than one database within a priority group.
- the Monitor 50 (not shown in FIG. 3) assigns priority to the dataset from the higher priority group in determining which dataset to synchronize.
- Priority may be assigned to a priority group through a combination of manual assignment from a privileged user or via an algorithm appropriate to the particular implementation of the invention.
- a privileged user can assign “Priority 1” to a first group of databases and “Priority 2” to a second group of databases, where the databases in the “Priority 1” group have an earlier creation date than those of the “Priority 2” group.
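The rule in the list above that an update is triggered only when the "Last Edited By" value is other than the CDB can be illustrated with a minimal Python sketch. The field names `last_edited_by` and `source`, the constant `CDB_NAME`, and both helper functions are hypothetical and chosen only for illustration; the publication does not specify an implementation.

```python
CDB_NAME = "CDB"  # hypothetical marker written by the central gateway


def stamp_from_cdb(row: dict, source_database: str) -> dict:
    """Copy a row for delivery to a local database, marking the CDB as last editor."""
    return {**row, "last_edited_by": CDB_NAME, "source": source_database}


def should_push_to_cdb(idb_entry: dict) -> bool:
    """An incremental entry triggers a CDB update only if the CDB did not write it."""
    return idb_entry["last_edited_by"] != CDB_NAME


# USER 1 edits DATABASE A: this entry should be propagated to the CDB.
local_edit = {"work_address": "1 Main St", "last_edited_by": "USER 1", "source": "DATABASE A"}

# The same row as it arrives in DATABASE B after synchronization from the CDB.
synced_copy = stamp_from_cdb(local_edit, source_database="DATABASE A")

print(should_push_to_cdb(local_edit))   # True
print(should_push_to_cdb(synced_copy))  # False
```

Without such a check, the copy written to DATABASE B would itself appear as a fresh modification and be sent back to the CDB, so the marker serves to stop the synchronization from echoing.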
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A computer-implemented system operative to maintain uniform data among datasets across two or more independent groups of databases, wherein each of the independent groups of databases includes at least one database having datasets. The system includes one or more central databases having at least datasets corresponding with the datasets among the independent groups of databases. A processor is operative to: identify the corresponding datasets; map to at least one master record a data format and one or more central database locations of the corresponding datasets; query an incremental database record created when a user modifies one or more datasets in one of the independent groups of databases ("the modified dataset(s)") so as to identify the modifications made by the user; query the at least one master record to identify the one or more central database locations of all of the datasets corresponding to the modified dataset(s); and synchronize among the one or more central database locations and the two or more independent groups of databases the modified dataset(s) and the datasets corresponding to the modified dataset(s) so as to reflect the modifications made by the user.
Description
Title
COMPUTER-IMPLEMENTED SYSTEM AND METHOD FOR MAINTAINING DATA UNIFORMITY ACROSS INDEPENDENT DATABASES
Cross-Reference to Related Applications
[0001] The present application is related to, and claims the benefit of priority from, United States Provisional Application Serial No. 63/608,358 filed 11 December 2023, the disclosure of which is incorporated herein by reference in its entirety.
Field of the Invention
[0002] The present invention relates to ensuring data consistency across corresponding datasets stored within independent databases.
Background
[0003] Complex data systems often contain redundant data, either by accident, by design, or through the history of joining two systems. For example, a large company employing thousands of software developers may have two groups of developers unaware that another group is using the same data. However, once the databases are structured and the code is written, it may be easier to maintain the two datasets rather than adopt a single database and edit the software.
[0004] Redundant data is problematic, particularly in closed or integrated systems. Users may experience data discrepancies when unknowingly connecting to different databases. System failure may occur when the system does not receive an expected response. Data may become corrupted as partial data is pulled from separate sources and aggregated in ways it should not be; for example, when a person’s affiliation and title are extracted from one database and their work address from a second database.
[0005] Redundant data is difficult to maintain. Between databases, data can be stored in different formats, different data structures, and with varying access. Over time, it can become impossible to determine which data is the most current as specific data points on independent databases are updated at varying times and through independent methods.
[0006] Due to the complexity of maintaining redundant data, standard practice is to eliminate data redundancies whenever practical, particularly in closed systems that cannot risk a user encountering data discrepancies.
[0007] However, certain scenarios would benefit from having redundant data. For example, in an online marketplace, an independent organization can generally export data only using the tools provided by the marketplace. To provide the organization an independent database that could be fully copied or even severed from the marketplace without disrupting the marketplace would require a solution supporting redundant data.
[0008] Another scenario is where a corporation has numerous business units operating independently. The corporation wishes the data systems for each business unit to be fully integrated to maximize efficiencies. However, the corporation also wishes to maintain the independence of each business unit to maximize the potential sale value of that unit. For a business unit to have a database that is fully independent yet also fully integrated with the parent database requires a solution that supports data redundancy.
Summary
[0009] Embodiments of the present invention which are described herein address some or all of the above-described issues.
[0010] The system described herein utilizes a process to ensure that data modifications, such as insertions, deletions, and edits, are propagated across all databases that have corresponding datasets for the affected data. A secondary process then validates the data consistency and updates when necessary. This validation process protects against database changes that circumvent the described system. For example, if a database administrator manually updates a database through the backend, the changes would still be captured and mirrored to the other databases.
[0011] In one embodiment, the present invention comprehends a computer-implemented system operative to maintain uniform data among datasets across two or more independent groups of databases, wherein each of said independent groups of databases includes at least one database having datasets. The system comprises one or more central databases (“CDBs”) comprising at least datasets corresponding with the datasets among the independent groups of databases, and at least one processor operative to: (i) identify the corresponding datasets; (ii) map to at least one master record a format and one or more CDB locations of the corresponding datasets; (iii) query an incremental database record created when a user modifies one or more datasets in one of the independent groups of databases (“the modified dataset(s)”) so as to identify the modifications made by the user; (iv) query the at least one master record to identify the one or more CDB locations of all datasets corresponding to the modified dataset(s); and (v) synchronize among the one or more central database locations and the two or more independent groups of databases the modified dataset(s) and the datasets corresponding to the modified dataset(s) so as to reflect the modifications made by the user.
[0012] Per one aspect, each of the two or more independent groups of databases is accessible to different users, and the one or more central database locations and the at least one master record are not accessible to any users of the two or more independent groups of databases.
[0013] Per another feature, each at least one master record comprises a list of one or more datasets and each of the one or more datasets' profiles. Each dataset profile includes data fields comprising the dataset, the locations of the data fields across all of the independent groups of databases, a data format within each of the independent groups of databases, and the one or more CDB locations for that dataset.
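By way of illustration only, the master record and dataset profile described above might be modeled as follows. This is a minimal Python sketch: the class names, attribute names, and example values (`speaker_contact`, `CDB_1`, table and column names) are all assumptions and are not taken from the specification.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class FieldLocation:
    """Where one data field lives in one independent database (hypothetical layout)."""
    database: str
    table: str
    column: str
    data_format: str  # e.g. "text" or "numeric"


@dataclass
class DatasetProfile:
    """One dataset: its data fields, their locations and formats, and its CDB location."""
    name: str
    cdb_location: str
    # field name -> locations of that field across the independent databases
    fields: dict[str, list[FieldLocation]] = field(default_factory=dict)


@dataclass
class MasterRecord:
    """A list of dataset profiles, keyed here by dataset name."""
    profiles: dict[str, DatasetProfile] = field(default_factory=dict)

    def datasets_containing(self, database: str, table: str, column: str) -> list[DatasetProfile]:
        """Return every profile that maps the given database column."""
        return [
            profile
            for profile in self.profiles.values()
            if any(
                loc.database == database and loc.table == table and loc.column == column
                for locations in profile.fields.values()
                for loc in locations
            )
        ]


mr = MasterRecord()
mr.profiles["speaker_contact"] = DatasetProfile(
    name="speaker_contact",
    cdb_location="CDB_1",
    fields={
        "speaker_id": [
            FieldLocation("DATABASE_A", "speakers", "id", "numeric"),
            FieldLocation("DATABASE_B", "faculty", "speaker_ref", "text"),
        ],
    },
)

hits = mr.datasets_containing("DATABASE_B", "faculty", "speaker_ref")
print([p.name for p in hits], hits[0].cdb_location)
```

The lookup mirrors the kind of query the processor makes against the master record: given a modified field in one database, find the affected dataset and the CDB location that holds it.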
[0014] In another aspect, the dataset profiles within each at least one master record are unique to that master record.
[0015] Per a further feature, the two or more independent groups of databases are each accessible through an application programming interface gateway. When a user modifies one or more datasets in one of the independent groups of databases, the application programming interface gateway updates data fields in the modified one or more datasets identifying the modifying user and the name(s) of the modified one or more datasets.
[0016] According to yet another aspect, the incremental database record contains a history of data insertions, deletions, and edits, and a timestamp of the modifications.
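The gateway behavior and the incremental database record described in the two preceding paragraphs can be sketched as follows. This is an illustrative Python example under stated assumptions: the `ApiGateway` class, its in-memory storage, and the field names `last_edited_by` and `modified_dataset` are hypothetical stand-ins for whatever a real implementation would use.

```python
from datetime import datetime, timezone


class ApiGateway:
    """Hypothetical gateway in front of one local database (held in memory here)."""

    def __init__(self, database_name: str) -> None:
        self.database_name = database_name
        self.datasets: dict[str, dict] = {}        # dataset name -> field values
        self.incremental_record: list[dict] = []   # history of modifications

    def modify(self, user: str, dataset: str, changes: dict, operation: str = "edit") -> None:
        row = self.datasets.setdefault(dataset, {})
        row.update(changes)
        # Stamp the dataset with the modifying user and the modified dataset's name.
        row["last_edited_by"] = user
        row["modified_dataset"] = dataset
        # Append a timestamped entry to the incremental record.
        self.incremental_record.append({
            "operation": operation,   # "insert", "delete", or "edit"
            "dataset": dataset,
            "fields": sorted(changes),
            "user": user,
            "timestamp": datetime.now(timezone.utc),
        })


gateway_a = ApiGateway("DATABASE_A")
gateway_a.modify("USER_1", "speaker_contact", {"work_address": "1 Main St"})
print(gateway_a.datasets["speaker_contact"]["last_edited_by"])  # USER_1
print(gateway_a.incremental_record[0]["fields"])                # ['work_address']
```

Routing every write through one method is a design choice of this sketch; it keeps the stamping and the history entry from drifting apart.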
[0017] Per a further feature, an application server monitor queries each incremental database record for modifications. The application server monitor generates a timestamped record of each query of each incremental database record, and the one or more central database locations are synchronized if the timestamp of the modifications in the incremental database record is later than the timestamped record of the last query by the application server monitor.
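The timestamp comparison performed by the application server monitor can be sketched as below. This is a minimal Python illustration, not the claimed implementation: the `Monitor` class, the `pending` method, and the treatment of the very first query (every entry counts as new) are assumptions.

```python
from datetime import datetime, timedelta, timezone


class Monitor:
    """Hypothetical monitor that remembers when it last queried each incremental record."""

    def __init__(self) -> None:
        self.last_query: dict[str, datetime] = {}

    def pending(self, record_name: str, entries: list[dict], now: datetime) -> list[dict]:
        """Return entries stamped later than the previous query, then log this query."""
        previous = self.last_query.get(record_name)
        fresh = [e for e in entries if previous is None or e["timestamp"] > previous]
        self.last_query[record_name] = now  # timestamped record of this query
        return fresh


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
idb_a = [{"dataset": "speaker_contact", "timestamp": t0 + timedelta(seconds=1)}]

monitor = Monitor()
first = monitor.pending("IDB_A", idb_a, now=t0 + timedelta(seconds=2))
second = monitor.pending("IDB_A", idb_a, now=t0 + timedelta(seconds=3))
idb_a.append({"dataset": "speaker_contact", "timestamp": t0 + timedelta(seconds=4)})
third = monitor.pending("IDB_A", idb_a, now=t0 + timedelta(seconds=5))

print(len(first), len(second), len(third))  # 1 0 1
```

Only the first and third queries return work, because only then does a modification carry a timestamp later than the preceding query.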
[0018] According to yet another feature, the incremental database record is automatically created when a user modifies one or more datasets in one of the independent groups of databases.
[0019] Per a further feature, each database in the two or more independent groups of databases is assigned a priority group to prioritize synchronization of modifications when simultaneous modifications to corresponding datasets are made by separate users in the independent groups of databases.
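One way to picture the priority-based resolution of simultaneous modifications is the following Python sketch. The group assignments, the convention that a lower number means higher priority, and the `pick_winner` function are illustrative assumptions only.

```python
# Hypothetical priority assignment: lower number = higher priority.
PRIORITY_OF = {"A": 1, "C": 1, "E": 1, "B": 2, "D": 2, "F": 2}


def pick_winner(candidates: list[tuple[str, dict]]) -> tuple[str, dict]:
    """Among simultaneous edits, return the one from the highest-priority database.

    `candidates` holds (database name, modified dataset values) pairs.
    """
    return min(candidates, key=lambda candidate: PRIORITY_OF[candidate[0]])


simultaneous = [
    ("B", {"work_address": "9 Elm Rd"}),
    ("A", {"work_address": "1 Main St"}),
]
winner_db, winner_values = pick_winner(simultaneous)
print(winner_db, winner_values["work_address"])  # A 1 Main St
```

The sketch assumes that competing edits always come from databases in different priority groups, so that `min` never has to break a tie.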
[0020] Per another feature, modification by a privileged user to one or more datasets in the one or more CDBs effects synchronization of the corresponding datasets in the two or more independent groups of databases.
Description of the Drawings
[0021] FIG. 1 is a flow diagram illustrating a system according to the present invention comprising independent, local users with independent, local databases with a single CDB, where aligned datasets are concurrently updated to maintain data uniformity.
[0022] FIG. 2 is a detailed flow diagram illustrating the update/synchronization process within the system as exemplified in FIG. 1.
[0022] FIG. 3 is a flow diagram further illustrating the update/synchronization process within the system.
[0023] FIG. 4 is a block diagram illustrating database groups within a priority hierarchy.
[0024] FIG. 5 is a diagram illustrating an embodiment of the present invention.
Written Description
[0025] The present invention provides a data synchronization scheme across independent databases where a user can be restricted to a single database within a system yet have their updates automatically implemented across multiple independent databases. The invention provides, by way of non-limiting example, the following advantages: the user can be restricted to a single database yet have their updates reflected in databases that they cannot access; independent databases designed to capture the same information and work in concert will avoid data discrepancies and the system failures that result from data discrepancies; independent databases can remain independent and be removed from the system without impairing the operation of those systems that previously relied upon the removed databases.
[0026] In general, the present invention is a computer-implemented system operative to maintain uniform data among datasets across one or more central databases and two or more independent groups of databases, wherein each of the independent groups of databases comprises at least one database. The system comprises at least one processor operative to: (i) identify corresponding datasets between the two or more independent groups of databases; (ii) map to at least one master record a format and one or more central database locations of the corresponding datasets; (iii) query an incremental database record created when a user modifies one or more datasets in one of the independent groups of databases (“the modified dataset(s)”) so as to identify the modifications made by the user; (iv) query the at least one master record to identify the one or more central database locations of all datasets corresponding to the modified dataset(s); and (v) synchronize the modified dataset(s) and the datasets corresponding to the modified dataset(s) among the one or more central database locations and in the two or more independent groups of databases so as to reflect the modifications made by the user.
[0027] Several exemplary embodiments of this system and method are described in more detail below in reference to the accompanying drawings.
[0028] In one exemplary embodiment, the system as described herein can be employed in an event logistics platform for the life science industry. According to this embodiment, each user (e.g., a pharmaceutical company) has its own independent group of one or more databases for relevant data, which data serves as the source for updates to a CDB from which platform operations are run using a shared codebase. By the foregoing, the present invention avoids the conventional need to support each third-party participant through a customized version of the platform; i.e., a unique codebase for each participant.
[0029] The present invention enables users to share a common codebase and also to implement independent rules which are captured within each of these users’ respective database(s) but enforced through the common codebase. For example and without limitation, and continuing the example where the present invention is utilized in an event logistics platform, the common codebase may require senior manager approval for expense authorizations while the threshold value is defined within the independent database of each separate senior manager’s organization. As the common codebase expands to support more data-driven customizations, users can increasingly rely upon a shared codebase to support their business needs while remaining confident only specified data would be shared across organizational units (i.e., with users outside of their own organization).
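The split between a shared rule and organization-specific data in the expense approval example can be illustrated with a short Python sketch. The setting name `approval_threshold`, the organization names and amounts, and the choice to require approval at or above the threshold are all assumptions made for the example.

```python
# Hypothetical per-organization settings, as each organization's own database might hold them.
ORG_SETTINGS = {
    "Company A": {"approval_threshold": 500.00},
    "Company B": {"approval_threshold": 2000.00},
}


def requires_senior_approval(organization: str, expense_amount: float) -> bool:
    """Shared rule: one code path for every organization, with the threshold read from data."""
    threshold = ORG_SETTINGS[organization]["approval_threshold"]
    return expense_amount >= threshold


print(requires_senior_approval("Company A", 750.00))  # True
print(requires_senior_approval("Company B", 750.00))  # False
```

The same function serves both organizations; only the stored threshold differs, which is the sense in which a rule is captured in the database but enforced through the common codebase.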
[0030] By facilitating utilization of a common codebase across the system through the use of synchronized databases (thereby obviating the need to create specialized code for each local user), the present invention lends itself also to utilization of artificial intelligence (AI). More specifically, it will be appreciated that AI can be utilized to implement local modifications to datasets, the creation of new datasets and databases, and the like.
[0031] Implementation of the aforementioned rules, also referred to as “business rules” in the exemplary embodiment, is facilitated by a business rules engine that, in the exemplary embodiment of an event logistics platform for the life science industry, may include, without limitation, the following features:
[0032] Organizational Hierarchy. An online tool to build a national salesforce including sales areas, reporting structure, budget alignment, user types and labels, user permissions and views, etc.
[0033] Import Salesforce. Once the organizational hierarchy is defined, the ability to upload a file to populate the sales areas and staffing assignments.
[0034] Import Zip Codes. The capacity to upload a file to align each sales area to one or more zip codes.
[0035] Business Requirements by Meeting Type. Create custom meeting types and business rules such as, by way of non-limiting example, virtual and live audiences, virtual and live speakers, audience type, default budgets, who can request and approve, minimum attendance, venue requirements, minimal lead time, etc.
[0036] Workflow by Meeting Type. For example, speaker confirmation, travel logistics, event logistics, invitations, expense and document capture, attendee profiles and signatures, etc.
[0037] Speaker Selection Criteria. Business rules such as speaker contracting, training, travel, utilization caps, etc.
[0038] Communication Templates. Emails, electronic files, push notifications, and texts to vendors, staff, faculty, and attendees aligned by presentation and meeting type.
[0039] Attendee Restrictions. Set restrictions for who is permitted to attend. For example, only doctors who have not seen this presentation (regardless of version) within the past 12 months.
[0040] Funding Source. Dynamically generate or import virtual credit cards for each event/expense type/beneficiary to both track and impose business controls. The credit cards are generated from: Client Upload; Client Funded Account; Agency Funded Account; System Provider-Funded Account.
[0041] Budgetary Rules and Allocations. For Example: Budget by Dollars vs. Budget by Events; Budget by Meeting Type; Budget Allocation to All Areas at All Levels; Access to Unallocated Funds; Restricting Funds by Quarter or Year; No Fixed Budget.
[0042] Faculty Payments. Set conditions and approval system to process an instant payment for Fee for Service to Faculty.
[0043] Expense Form Submission and Review. A system of image capture and expense alignment, compliance check, and approval process to initiate an instant payment to Faculty for expense reimbursement.
[0044] Default Field Values. A process for onboarding new organizations with default values that can be edited as needed. For Example: Specialties; Sub Specialties; Degrees; Professions.
[0045] Document Capture. The ability to create a list of required documents by meeting type and then identify each file or image capture as one of the required document types.
[0046] Agency Assignment. For example, Company A can choose to self-manage or appoint the ABC Agency to manage events. If Company A is unhappy with ABC, they can transfer management to the XYZ Agency. Transferring agencies will update workflows, support information, communication templates, and other items that are unique to each agency.
[0047] Program Fee Milestones. Align event management fee milestones and percent of fee earned to tasks completed within the workflow.
[0048] Of course, it will be understood that the foregoing “business rules” are simply exemplary of a software application that may be utilized in connection with the database synchronization system of the present invention. Numerous other applications and, optionally, rules associated with those applications, may be superimposed on the system of the present invention.
[0049] FIG. 1 depicts a first example of the present invention where first and second independent groups of databases, each including at least one database, in the form of DATABASE A 12 and DATABASE B 22 are associated with a single CDB 30. DATABASES A and B (exemplifying the independent groups of databases each including at least one database) are each referred to in this example as "local" databases and their users, “USER 1” 10 and “USER 2” 20, as "local" users.
[0050] Each local database 12, 22 is accessible to its respective local user 10, 20 through an application programming interface (API) Gateway 11 or 21. Each API facilitates implementation of the business rules engine described above in connection with the exemplary application of the present invention.
[0051] Each local database 12, 22 and the at least one CDB 30 comprise datasets comprising one or more data fields.
[0052] The system comprises at least one Master Record (MR) 40. The master record 40 is a data file or database record that lists the databases’ (e.g., 12, 22) names and locations within their respective databases (e.g., 12, 22).
[0053] Within a MR 40 is also a list of datasets and the datasets’ profiles. A dataset profile consists of the data fields that comprise the dataset, their locations (for instance, table and column within an SQL database) across all databases 12, 22, etc., data format within each database (e.g., 12, 22), and the CDB (e.g., 30 or, in the case of multiple CDBs, another CDB) for that dataset. The dataset profile therefore maintains a one-to-one data alignment between datasets.
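By way of non-limiting illustration, the master record and its dataset profiles might be modeled as in the following Python sketch. The class names, the dataset name "speaker_contact", and the table and column names are hypothetical and do not appear in the disclosure; only the kinds of information held (data fields, their locations per database, data format, and the CDB for the dataset) follow the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class FieldLocation:
    """Where one data field lives in one database, and how it is stored there."""
    database: str      # e.g. "DATABASE A"
    table: str         # hypothetical table name
    column: str        # hypothetical column name
    data_format: str   # e.g. "text" or "numeric"


@dataclass
class DatasetProfile:
    """Profile of one dataset: its fields, their locations, and its CDB."""
    name: str
    central_database: str
    # field name -> one FieldLocation per database holding that field
    fields: Dict[str, List[FieldLocation]] = field(default_factory=dict)

    def locations_in(self, database: str) -> Dict[str, FieldLocation]:
        """Return field name -> location for a single database."""
        return {
            fname: loc
            for fname, locs in self.fields.items()
            for loc in locs
            if loc.database == database
        }


@dataclass
class MasterRecord:
    """Catalog of dataset profiles, keyed by dataset name."""
    profiles: Dict[str, DatasetProfile] = field(default_factory=dict)

    def datasets_touching(self, database: str, table: str, column: str) -> List[str]:
        """Names of datasets that include the given table/column of a database."""
        return [
            p.name
            for p in self.profiles.values()
            for locs in p.fields.values()
            for loc in locs
            if (loc.database, loc.table, loc.column) == (database, table, column)
        ]


mr = MasterRecord()
mr.profiles["speaker_contact"] = DatasetProfile(
    name="speaker_contact",
    central_database="CDB",
    fields={
        "speaker_id": [
            FieldLocation("DATABASE A", "speakers", "id", "numeric"),
            FieldLocation("DATABASE B", "faculty", "speaker_ref", "text"),
        ],
    },
)
```

In this sketch the same field is stored numerically in one database and as text in another, which is the kind of per-database difference the dataset profile is described as recording.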
[0054] “Data format” refers to how the data is structured within a database (e.g., 12, 22). For example, an ID number could be stored as a numerical or text value; a person’s job specialty could be stored as a text field or as a numerical value that references a specialty table; an address could be stored as a numerical and a text value, etc. Of course, these are only exemplary forms of data and not intended to be limiting of the claimed invention.
[0055] The dataset lists and profiles within MR 40 are unique to that MR, just as the participating users’ (e.g., 10, 20) independent groups of databases (e.g., local databases 12, 22) will be unique to them. Consider, by way of example, a scenario where users (e.g., 10, 20) include companies A, B, and C. Their respective independent groups of one or more databases (e.g., 12, 22 in the example of FIG. 1) are unique to those users in structure and content, though the data itself may overlap with data in other users’ independent groups of one or more databases (or, even where the data does not overlap, the updating of that data to the CDB may be needed to facilitate the provision of correct output to another user where the output is based at least in part on that information, such as, by way of non-limiting example, where the system utilizes address data in a database to facilitate providing another user with geolocation information for the address data through the software application of which the present invention is a part). Consequently, the MR 40 catalogs how each user’s independent database is structured so as to facilitate synchronization of any modifications.
[0056] The integrity of the process is secured by the APIs 11, 21. Local users 10, 20 do not have direct access to the at least one CDB 30, the MR 40, or any local database (e.g., 12, 22) aside from their own local database(s), which they access through the APIs 11, 21. Access to the CDB(s) 30 and MR 40 is restricted to designated personnel, such as system administrators ("privileged users").
[0057] A CDB 30 is a database that contains the primary reference for a specific dataset. Only one CDB can be indicated as the primary reference for any dataset.
[0058] When the local user (whether “USER 1” 10 or “USER 2” 20 in the illustrated example) makes a modification to their local database (DATABASE A or B, respectively, for USER 1 or USER 2), the system automatically generates an Incremental Data Backup (IDB) 13 or 23 for that local database 12 or 22. This is shown in FIG. 1 as IDB A 13 (for DATABASE A) and IDB B 23 (for DATABASE B).
[0059] An IDB (e.g., 13, 23) is a common feature of cloud-based databases. All data changes are captured in the IDBs, thereby creating an audit history that indicates what data was updated, when, why, and how. The IDB contains a history of data insertions, deletions, and edits, including a timestamp of the updates.
[0060] More specifically, and in accordance with the exemplary embodiment, when a local user (e.g., USER 1) makes an update to their local database (e.g., DATABASE A), the API Gateway (e.g., 11) will update in the IDB a "Last Edited By" data field with data identifying the user and will populate a "Source" data field with the name of the modified database (in this case, DATABASE A). See FIG. 2.
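By way of non-limiting illustration, the stamping of the "Last Edited By" and "Source" data fields might look like the following Python sketch. The `ApiGateway` and `IdbEntry` names, the in-memory dictionary standing in for the local database, and the "work_address" field are assumptions made for the example only.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List


@dataclass(frozen=True)
class IdbEntry:
    """One row of an incremental data backup (IDB); field names are illustrative."""
    field_name: str
    new_value: str
    last_edited_by: str
    source: str
    timestamp: datetime


class ApiGateway:
    """Hypothetical gateway that fronts a single local database."""

    def __init__(self, database_name: str) -> None:
        self.database_name = database_name
        self.database: Dict[str, str] = {}   # stand-in for the local database
        self.idb: List[IdbEntry] = []        # stand-in for that database's IDB

    def update(self, user: str, field_name: str, value: str) -> IdbEntry:
        """Apply the edit, then log it with "Last Edited By" and "Source"."""
        self.database[field_name] = value
        entry = IdbEntry(
            field_name=field_name,
            new_value=value,
            last_edited_by=user,            # identifies the modifying user
            source=self.database_name,      # names the modified database
            timestamp=datetime.now(timezone.utc),
        )
        self.idb.append(entry)
        return entry


gateway_a = ApiGateway("DATABASE A")
logged = gateway_a.update("USER 1", "work_address", "12 Main St")
```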
[0061] Still referring to FIG. 1, using a cron scheduled process, i.e., a background process executing non-interactive tasks, an application server (Monitor) 50 continuously queries each IDB (e.g., 13, 23) for an update indicating a modification to any database.
[0062] The Monitor 50 generates a timestamped record of each query. If an update within an IDB (e.g., 13, 23) has a later timestamp than the prior Monitor query, the Monitor 50 initiates an update process.
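By way of non-limiting illustration, the timestamp comparison performed by the Monitor might be sketched in Python as follows. The `Monitor` class, the dictionary layout of an IDB entry, and the choice to treat every entry as new on the very first query are assumptions of this example, not features stated in the disclosure.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


class Monitor:
    """Hypothetical poller that remembers when it last queried each IDB."""

    def __init__(self) -> None:
        self.last_query: Dict[str, datetime] = {}

    def poll(self, idb_name: str, entries: List[dict], now: datetime) -> List[dict]:
        """Return entries newer than the previous query, then record this query."""
        previous = self.last_query.get(idb_name)
        fresh = [e for e in entries if previous is None or e["timestamp"] > previous]
        self.last_query[idb_name] = now   # timestamped record of this query
        return fresh


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
idb_a = [{"field": "title", "timestamp": t0 + timedelta(seconds=30)}]

monitor = Monitor()
first = monitor.poll("IDB A", idb_a, now=t0 + timedelta(seconds=60))
second = monitor.poll("IDB A", idb_a, now=t0 + timedelta(seconds=120))

idb_a.append({"field": "address", "timestamp": t0 + timedelta(seconds=150)})
third = monitor.poll("IDB A", idb_a, now=t0 + timedelta(seconds=180))
```

Here `first` holds the "title" edit, `second` is empty because nothing changed between queries, and `third` holds only the "address" edit, which would then start the update process.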
[0063] Referring also to FIG. 2, the update process consists of the Monitor 50 identifying the data fields that were updated in an IDB and then querying the MR to identify which datasets were impacted and what the CDBs are for each of those datasets. In the illustrated example, updates to data in “Fields” identified by arrows are shown. As noted above, when a local user (USER 1 in the example of FIG. 2) updates data in their local database (DATABASE A) the API Gateway (11 in the example of FIG. 2) will update in the IDB a "Last Edited By" data field with data identifying the user (USER 1 in the example of FIG. 2) and will populate a "Source" data field with the name of the modified database (in this case, DATABASE A).
[0064] In the example of FIGS. 1 and 2 there is only one CDB 30, so Monitor 50 copies those updated datasets from, in this example, IDB A 13 to the CDB 30 through an API Gateway CD 60.
[0065] As shown in FIGS. 1 and 2, the update to the CDB 30 creates an IDB 31 for the CDB.
[0066] Monitor 50 performs the same continuous querying on IDB 31 as described above for the IDBs A and B 13, 23 of the local databases 12, 22. When an update has been identified, Monitor 50 queries the MR 40 to determine the location and data format of the datasets on all databases and then copies the datasets to the relevant local databases (in the example, database 22) through API Gateway CD 60, as indicated by the arrows A in FIG. 1.
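By way of non-limiting illustration, copying a value into databases that store it in different data formats might be sketched in Python as follows. The format labels "text" and "numeric" and the function names are assumptions of this example; the disclosure names numerical and text storage only as examples of data formats.

```python
from typing import Callable, Dict, Union

Value = Union[int, str]

# Assumed format labels, one converter per label.
CONVERTERS: Dict[str, Callable[[Value], Value]] = {
    "text": lambda v: str(v),
    "numeric": lambda v: int(v),
}


def convert(value: Value, target_format: str) -> Value:
    """Re-express a value in the format the target database uses."""
    try:
        return CONVERTERS[target_format](value)
    except KeyError:
        raise ValueError(f"unknown data format: {target_format!r}") from None


def copy_field(value: Value, targets: Dict[str, str]) -> Dict[str, Value]:
    """Map database name -> converted value, given database name -> data format."""
    return {db: convert(value, fmt) for db, fmt in targets.items()}


# An ID held numerically at the source is written as text to DATABASE B.
written = copy_field(1042, {"DATABASE B": "text", "CDB": "numeric"})
```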
[0067] A dataset that is the source for a database update to the CDB 30 does not then get overwritten by API Gateway CD 60. More specifically according to the illustrated example, when data is synchronized from the CDB 30 to any local databases that were not the source of the modification, API Gateway CD 60 will overwrite the "Last Edited By" and "Source" values to be, respectively, CDB 30 and the source database (e.g., DATABASE A or DATABASE B).
[0068] An IDB (e.g., 13) does not trigger an update to the CDB 30 as described above when the "Last Edited By" value is the CDB. An update is triggered only when the “Last Edited By” value is other than the CDB.
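By way of non-limiting illustration, the rule that keeps a synchronized change from echoing back to the CDB might be sketched in Python as follows. The marker string "CDB", the dictionary keys, and the function names are assumptions of this example.

```python
from typing import Dict, List

CDB_EDITOR = "CDB"  # assumed marker written by API Gateway CD


def should_push_to_cdb(entry: Dict[str, str]) -> bool:
    """An IDB entry triggers a CDB update only if the CDB did not write it."""
    return entry["last_edited_by"] != CDB_EDITOR


def stamp_for_propagation(entry: Dict[str, str]) -> Dict[str, str]:
    """Copy of the entry as written to non-source databases: editor becomes the CDB,
    while "source" keeps naming the database where the change originated."""
    return {**entry, "last_edited_by": CDB_EDITOR}


def propagation_targets(entry: Dict[str, str], local_databases: List[str]) -> List[str]:
    """Local databases to update: every one except the source of the change."""
    return [db for db in local_databases if db != entry["source"]]


user_edit = {"field": "title", "last_edited_by": "USER 1", "source": "DATABASE A"}
echoed = stamp_for_propagation(user_edit)
targets = propagation_targets(user_edit, ["DATABASE A", "DATABASE B"])
```

In this sketch the user's edit qualifies for a CDB update, while the copy later written to DATABASE B carries the CDB as editor and therefore does not start a second round of synchronization.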
[0069] The APIs used to reference the MR 40 and update databases do not have edit rights to the MR 40.
[0070] Notwithstanding the number of corresponding datasets or the number of local independent databases (e.g., 12, 22), the process described herein will ensure corresponding data will remain consistent across all databases.
[0071] Referring now to FIG. 3, a scenario comprehended by the present invention is when the same dataset across two or more local databases is updated simultaneously by separate local users on their respective independent local databases. As shown in FIG. 3, each database is assigned a priority group; e.g., Group A, designated "Priority 1," comprising databases A, C, and E; and Group B, designated "Priority 2," comprising databases B, D, and F. A dataset cannot exist on more than one database within a priority group. In this situation, the Monitor 50 (not shown in FIG. 3) assigns priority to the dataset from the higher priority group in determining which dataset to synchronize.
[0072] Priority may be assigned to a priority group through manual assignment by a privileged user, via an algorithm appropriate to the particular implementation of the invention, or through a combination of the two. By way of non-limiting example, a privileged user can assign “Priority 1” to a first group of databases and “Priority 2” to a second group of databases, where the databases in the “Priority 1” group have an earlier creation date than those of the “Priority 2” group.
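By way of non-limiting illustration, priority groups and the resolution of simultaneous updates might be sketched in Python as follows. The group membership mirrors the FIG. 3 example; the dataset names, their placement on databases, and the function names are hypothetical.

```python
from typing import Dict, List

# Group membership mirrors the FIG. 3 example; lower number = higher priority.
PRIORITY_GROUPS: Dict[int, List[str]] = {
    1: ["A", "C", "E"],
    2: ["B", "D", "F"],
}

# Made-up placement of datasets on databases, for illustration only.
DATASET_LOCATIONS: Dict[str, List[str]] = {
    "speaker_contact": ["A", "B"],
    "venue_details": ["C", "D", "F"],
}


def priority_of(database: str) -> int:
    """Priority number of the group a database belongs to."""
    for priority, members in PRIORITY_GROUPS.items():
        if database in members:
            return priority
    raise KeyError(database)


def violations(locations: Dict[str, List[str]]) -> List[str]:
    """Datasets stored on more than one database of the same priority group."""
    bad = []
    for dataset, dbs in locations.items():
        priorities = [priority_of(db) for db in dbs]
        if len(priorities) != len(set(priorities)):
            bad.append(dataset)
    return bad


def pick_winner(simultaneous_sources: List[str]) -> str:
    """Among databases that updated a dataset at the same time, pick the one
    from the higher priority group."""
    return min(simultaneous_sources, key=priority_of)
```

Because a dataset may appear only once per priority group, two simultaneous updates to the same dataset always come from different groups, so `pick_winner` never has to break a tie. In this sketch `violations(DATASET_LOCATIONS)` flags "venue_details", which sits on both D and F in the Priority 2 group.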
[0073] Referring next to FIG. 4, implementation of the synchronization process described above is illustrated in a decision flowchart. As previously explained, Monitor 50 queries all IDBs (e.g., 13, 23) at step 100 to determine if any dataset was updated by a local user since the last query. If not, then Monitor 50 queries continue, as shown at step 101. If “yes,” then the updated records are retrieved from the dataset updated by a local user since the time of the last query 102 and sorted by timestamp and group priority (if applicable) 103. Synchronization of the data is then effected through the API Gateway (e.g., 61) by execution of a write command 104 which effects a query of the MR 40, as shown at step 105, to determine the location of the affected data, whereafter the matching data is updated across affected databases 106. Upon completion of the update, the Monitor queries whether additional updates are required 107.
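By way of non-limiting illustration, one pass through steps 100 to 106 of the flowchart might be sketched in Python as follows. Integer timestamps, the dictionary shapes, and the field and column names are assumptions of this example. The sort direction is also an assumption: entries are applied oldest first, and among equal timestamps the higher-priority source is applied last so that its value is the one that remains.

```python
from typing import Dict, List


def sync_cycle(
    idbs: Dict[str, List[dict]],
    last_query: int,
    master_record: Dict[str, Dict[str, str]],
    databases: Dict[str, Dict[str, str]],
    priority: Dict[str, int],
) -> int:
    """Run one synchronization pass and return the number of entries applied."""
    # Steps 100 and 102: gather entries written since the last query.
    updated = [
        e for entries in idbs.values() for e in entries if e["timestamp"] > last_query
    ]
    if not updated:
        return 0  # Step 101: nothing to do, polling simply continues.
    # Step 103: oldest first; at equal timestamps, priority 1 is applied last.
    updated.sort(key=lambda e: (e["timestamp"], -priority.get(e["source"], 0)))
    for e in updated:  # Step 104: write command per entry.
        # Step 105: the master record gives the column for this field in each database.
        for db_name, column in master_record[e["field"]].items():
            databases[db_name][column] = e["value"]  # Step 106: update matching data.
    return len(updated)


master_record = {
    "work_address": {"DATABASE A": "addr", "DATABASE B": "address_line", "CDB": "work_address"}
}
databases = {
    "DATABASE A": {"addr": "old"},
    "DATABASE B": {"address_line": "old"},
    "CDB": {"work_address": "old"},
}
priority = {"DATABASE A": 1, "DATABASE B": 2}
idbs = {
    "IDB A": [{"field": "work_address", "value": "12 Main St", "source": "DATABASE A", "timestamp": 5}],
    "IDB B": [{"field": "work_address", "value": "99 Elm St", "source": "DATABASE B", "timestamp": 5}],
}

applied = sync_cycle(idbs, 0, master_record, databases, priority)
applied_again = sync_cycle(idbs, 5, master_record, databases, priority)
```

With both edits carrying the same timestamp, the value from DATABASE A (Priority 1) is the one left in all three databases, and the second pass finds nothing newer than the last query.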
[0074] One scenario comprehended by the present invention is when updates to the CDB(s) are made directly to the CDB, rather than via modifications to a local database. In such a situation, an IDB will still be created and trigger updates to the other (local) databases where the datasets are present, such as in the manner described above in connection with modifications to a local database.
[0075] When a privileged user modifies one or more datasets in the one or more CDBs, such as altering data formats, for instance, synchronization of the corresponding datasets in the two or more independent groups of databases is effected only after earlier modifications to the same one or more datasets by any users in the independent groups of databases have been synchronized. This hierarchy prevents a local user’s earlier modifications from overwriting the privileged user’s modifications.
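By way of non-limiting illustration, the ordering described above might be sketched in Python as follows. The dictionary keys, the integer timestamps, and the tie-break that places a local edit ahead of a privileged edit bearing the same timestamp are assumptions of this example.

```python
from typing import List


def order_for_sync(pending: List[dict]) -> List[dict]:
    """Order pending modifications to one dataset: earlier edits first, and a
    privileged edit after any local edit with the same timestamp."""
    return sorted(pending, key=lambda m: (m["timestamp"], m["privileged"]))


pending = [
    {"editor": "admin", "privileged": True, "timestamp": 20, "value": "numeric-id"},
    {"editor": "USER 1", "privileged": False, "timestamp": 10, "value": "A-17"},
]

ordered = order_for_sync(pending)
final_value = ordered[-1]["value"]  # the value written last is the one that remains
```

Because the local user's earlier edit is synchronized first, the privileged user's later modification is written last and is not overwritten by it.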
[0076] In general, techniques for maintaining consistency between data structures may be implemented with facilities consistent with any hardware system or hardware systems defined herein. Many variations, modifications, additions, and improvements are possible. Plural instances may be provided for components, operations, or structures described herein as a single instance. Finally, boundaries between various components, operations, and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are possible and may fall within a scope of various embodiments of the present disclosure. In general, structures and functionality presented as separate components in the exemplary configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements fall within a scope of embodiments of the present disclosure as represented by the appended claims.
[0077] FIG. 5 is a schematic representation of an exemplary environment wherein the independent groups of databases 12, 22 and the CDB 30 are provided in a cloud computing environment (designated at box 200), with access by local users 10, 20 to their respective local databases 12, 22 facilitated by local computers C1, C2, and with the at least one processor for implementing the system in the manner described above being provided in the form of the depicted computing device C3. The separate nature of users' databases relative to each other and to the CDB is represented by the dashed boxes 210, 211. Of course, other arrangements are possible. For instance and without limitation, the independent groups of databases 12, 22 could reside in separate cloud computing environments or in local, networked servers. The only limitation in these regards is that the independent groups of databases 12, 22 are accessible to permit synchronization of data across the system as described elsewhere herein.
[0078] It is to be understood that the disclosed embodiments represent presently preferred examples of how to make and use the invention, but are intended to enable rather than limit the invention. Variations and modifications of the illustrated examples in the foregoing written specification and drawings may be possible without departing from the scope of the invention.
[0079] It should further be understood that to the extent the term “invention” is used in the written specification, it is not to be construed as a limiting term as to number of claimed or disclosed inventions or discoveries or the scope of any such invention or discovery, but as a term which has long been used to describe new and useful improvements in science and the useful arts. The scope of the invention supported by the above disclosure should accordingly be construed within the scope of what it teaches and suggests to those skilled in the art, and within the scope of any claims that the above disclosure supports in this provisional application or in any non-provisional application claiming priority to this provisional application.
[0080] The claims are representative of the invention and are not intended to limit the claimed invention with respect to other features which are supported by or might become apparent from the description, and which might be claimed subsequently.
Claims
1. A computer-implemented system operative to maintain uniform data among datasets across two or more independent groups of databases, wherein each of said independent groups of databases includes at least one database having datasets, the system comprising: one or more central databases comprising at least datasets corresponding with the datasets among the independent groups of databases; and at least one processor operative to: identify the corresponding datasets; map to at least one master record a data format and one or more central database locations of the corresponding datasets; query an incremental database record created when a user modifies one or more datasets in one of the independent groups of databases (“the modified dataset(s)”) so as to identify the modifications made by the user; query the at least one master record to identify the one or more central database locations of all of the datasets corresponding to the modified dataset(s); and synchronize among the one or more central database locations and the two or more independent groups of databases the modified dataset(s) and the datasets corresponding to the modified dataset(s) so as to reflect the modifications made by the user.
2. The computer-implemented system of claim 1, wherein each of the two or more independent groups of databases are accessible to different users, and wherein further the one or more central database locations and the at least one master record are not accessible to any users of the two or more independent groups of databases.
3. The computer-implemented system of claim 1, wherein each at least one master record comprises a list of one or more datasets and each of the one or more datasets' profiles, wherein each dataset profile includes data fields comprising the dataset, the locations of the data fields across all of the independent groups of databases, a data format within each of the independent groups of databases, and the central database location for that dataset.
4. The computer-implemented system of claim 3, wherein the dataset profiles within each at least one master record are unique to that master record.
5. The computer-implemented system of claim 5, wherein, when a user modifies one or more datasets in one of the independent groups of databases, an application programming interface gateway updates data fields in the modified one or more datasets identifying at least the modifying user and the name(s) of the modified one or more datasets.
6. The computer-implemented system of claim 1, wherein the incremental database record contains a history of data insertions, deletions, and edits, and a timestamp of the modifications.
7. The computer-implemented system of claim 1, wherein an application server monitor queries each incremental database record for modifications.
8. The computer-implemented system of claim 7, wherein the application server monitor generates a timestamped record of each query of each incremental database record, and wherein further the one or more central database locations are synchronized if the timestamp of the modifications in the incremental database has a later timestamp than the timestamped record of the last query by the application server monitor.
9. The computer-implemented system of claim 1, wherein the incremental database record is automatically created when a user modifies one or more datasets in one of the independent groups of databases.
10. The computer-implemented system of claim 1, wherein further each database in the two or more independent groups of databases is assigned a priority group to prioritize synchronization of modifications when simultaneous modifications to corresponding datasets are made by separate users in the independent groups of databases.
11. The computer-implemented system of claim 1, wherein modification by a privileged user to one or more datasets in the one or more central databases effects synchronization of the corresponding datasets in the two or more independent groups of databases only after earlier modifications to the same one or more datasets by any users in the independent groups of databases.
12. A computer-implemented method, comprising: providing two or more independent groups of databases, wherein each of said independent groups of databases comprises at least one database having datasets, wherein each of the two or more independent groups of databases are accessible to different users; providing one or more central databases comprising at least datasets corresponding with the datasets among the independent groups of databases, wherein the one or more central databases are accessible to one or more privileged users but not accessible to any users of the two or more independent groups of databases; providing at least one processor operative to: identify the corresponding datasets; map to at least one master record a data format and one or more central database locations of the corresponding datasets; query an incremental database record created when a user modifies one or more datasets in one of the independent groups of databases (“the modified dataset(s)”) so as to identify the modifications made by the user;
query the at least one master record to identify the one or more central database locations of all of the datasets corresponding to the modified dataset(s); and synchronize among the one or more central database locations and the two or more independent groups of databases the modified dataset(s) and the datasets corresponding to the modified dataset(s) so as to reflect the modifications made by the user.
13. The computer-implemented method of claim 12, wherein further the at least one master record is not accessible to any users of the two or more independent groups of databases.
14. The computer-implemented method of claim 12, wherein each at least one master record comprises a list of one or more datasets and each of the one or more datasets' profiles, wherein each dataset profile includes data fields comprising the dataset, the locations of the data fields across all of the independent groups of databases, a data format within each of the independent groups of databases, and the one or more central database locations for that dataset.
15. The computer-implemented method of claim 14, wherein the dataset profiles within each at least one master record are unique to that master record.
16. The computer-implemented method of claim 12, wherein the two or more independent groups of databases are each accessible through an application programming interface gateway.
17. The computer-implemented method of claim 16, wherein, when a user modifies one or more datasets in one of the independent groups of databases, the application programming interface gateway updates data fields in the modified one or more datasets identifying the modifying user and the name(s) of the modified one or more datasets.
18. The computer-implemented method of claim 12, wherein the incremental database record contains a history of data insertions, deletions, and edits, and a timestamp of the modifications.
19. The computer-implemented method of claim 12, wherein an application server monitor queries each incremental database record for modifications.
20. The computer-implemented method of claim 19, wherein the application server monitor generates a timestamped record of each query of each incremental database record, and wherein further the one or more central database locations are synchronized if the timestamp of the modifications in the incremental database has a later timestamp than the timestamped record of the last query by the application server monitor.
21. The computer-implemented method of claim 12, wherein the incremental database record is automatically created when a user modifies one or more datasets in one of the independent groups of databases.
22. The computer-implemented method of claim 12, wherein further each database in the two or more independent groups of databases is assigned a priority group to prioritize synchronization of modifications when simultaneous modifications to corresponding datasets are made by separate users in the independent groups of databases.
23. The computer-implemented method of claim 12, wherein modification by a privileged user to one or more datasets in the one or more central databases effects synchronization of the corresponding datasets in the two or more independent groups of databases only after earlier modifications to the same one or more datasets by any users in the independent groups of databases.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363608358P | 2023-12-11 | 2023-12-11 | |
| US63/608,358 | 2023-12-11 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025128636A1 (en) | 2025-06-19 |
Family
ID=96058354
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2024/059476 Pending WO2025128636A1 (en) | 2023-12-11 | 2024-12-11 | Computer-implemented system and method for maintaining data uniformity across independent databases |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025128636A1 (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20130339297A1 (en) * | 2012-06-18 | 2013-12-19 | Actifio, Inc. | System and method for efficient database record replication using different replication strategies based on the database records |
| US20170329834A1 (en) * | 2010-12-28 | 2017-11-16 | Amazon Technologies, Inc. | Data replication framework |
| US20190354628A1 (en) * | 2018-05-21 | 2019-11-21 | Pure Storage, Inc. | Asynchronous replication of synchronously replicated data |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24904769; Country of ref document: EP; Kind code of ref document: A1 |