PHP: updating a large data set incrementally

1. Background: the company is in the domain name business. We want to integrate GoDaddy's fixed-price domain listings. The data set is large, and GoDaddy only provides an interface that returns all of the data. The product requires the domain data to be updated regularly (newly listed domains added, domains with an abnormal status removed).
2. Question: once the full data set has been imported, how should it be kept up to date? The current method is to delete all GoDaddy rows from the table, then call the interface again and re-import everything. Is there a good way to update the data incrementally? (The data set is about 6 million rows.)
Does anyone have good advice? Thank you!

Mar.09,2021

How are you running the import: with PHP CLI, through a web interface, or with a third-party extension such as Swoole?

Where is the bottleneck? Is it the database? Even if you start 100 processes, if the database cannot keep up with the inserts, the import will take about the same time.

Could you split the import so that each table holds 500,000 rows?


Personally, I think the approach of emptying the table first and then refilling it is flawed. After the data is deleted, there is a window in which the table is empty until the new data has been loaded.

Does the bulk API support sorting? If it does, you can page through it with an offset to pick up newly listed domains. You will also need an API that looks up a single domain. One should exist, otherwise you could not run a domain name business on top of it.
When a user looks up a domain, check when the local record was last updated. If its age exceeds a certain threshold, re-query that domain on its own and update your system.
This is only one way of thinking about it, of course. If those conditions are not met, you will have to find another approach.
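
The threshold check described above can be sketched roughly as follows. This is only an illustration: the record layout (`domain`, `status`, `updated_at`), the one-day default threshold, and the `$fetch` callable standing in for a single-domain lookup API are all assumptions, not GoDaddy's actual interface.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether a locally stored domain record is stale.
 *
 * @param int $updatedAt Unix timestamp of the last local refresh
 * @param int $now       Current Unix timestamp
 * @param int $maxAge    Threshold in seconds (hypothetical value, tune as needed)
 */
function isStale(int $updatedAt, int $now, int $maxAge): bool
{
    return ($now - $updatedAt) > $maxAge;
}

/**
 * Return the record, re-querying the remote API only when it is stale.
 *
 * @param array    $record ['domain' => string, 'status' => string, 'updated_at' => int]
 * @param callable $fetch  Hypothetical single-domain lookup: fn(string $domain): array
 */
function refreshIfStale(array $record, callable $fetch, int $now, int $maxAge = 86400): array
{
    if (!isStale($record['updated_at'], $now, $maxAge)) {
        return $record; // still fresh: serve the local copy
    }

    $remote = $fetch($record['domain']); // one remote call for this domain only

    return [
        'domain'     => $record['domain'],
        'status'     => $remote['status'],
        'updated_at' => $now,
    ];
}
```

The caller would write the returned record back to the database whenever its `updated_at` has changed. Passing `$now` in as a parameter, rather than calling `time()` inside the function, keeps the check easy to test.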


The approach of "emptying the table and re-importing all the data" puts too much pressure on the database. Instead, after you call the API and get the new full data set, traverse it and compare it with the existing data record by record. If a record differs, the data has changed, and you can update the corresponding row in the database.
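
A minimal sketch of that comparison, assuming both the existing rows and the fresh export can be reduced to a map from domain name to a fingerprint of the row (for example `md5(json_encode($row))`). The function name `diffDomains` and the fingerprint idea are illustrative assumptions, not something the API provides.

```php
<?php
declare(strict_types=1);

/**
 * Compare the existing rows with a fresh full export.
 *
 * Both arguments map a domain name to a fingerprint of its row.
 *
 * @param array<string,string> $existing Rows currently in the database
 * @param array<string,string> $incoming Rows from the new full export
 * @return array{insert: string[], update: string[], delete: string[]}
 */
function diffDomains(array $existing, array $incoming): array
{
    $insert = [];
    $update = [];

    foreach ($incoming as $domain => $hash) {
        if (!array_key_exists($domain, $existing)) {
            $insert[] = $domain;            // newly listed domain
        } elseif ($existing[$domain] !== $hash) {
            $update[] = $domain;            // row content has changed
        }
    }

    // Domains that are no longer in the export should be removed locally.
    $delete = array_keys(array_diff_key($existing, $incoming));

    return ['insert' => $insert, 'update' => $update, 'delete' => $delete];
}
```

Only the rows in the three resulting lists need to touch the database, ideally in batches. With about 6 million domains, holding both maps in memory at once may be too heavy for a single PHP process, so in practice the comparison would probably be done in chunks, for example one sorted range of domain names at a time.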

In fact, the best solution in this situation is for GoDaddy to provide an incremental interface.


How about writing a separate program that compares the new full data with the existing data and exposes the differences as an incremental interface for the main site?
