Using PHP to collect a large amount of data, how can access performance be improved?

problem description:
I use PHP to collect data from foreign websites for testing. The data is saved as txt files on the server. When it needs to be accessed, PHP reads the data from the txt file and returns it to the user.
After the server has been running for a while, opening resources on the server through a URL becomes quite slow; it was not slow at first.
Note: the URL structure is the same as the source website's, except that it is replaced with our own top-level domain name.
Collection is triggered on demand: on the first request the data is fetched from the remote site, stored locally, and returned to the user; from the second request on it is read locally.
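
A minimal sketch of that flow; the cache/ directory, the hash-based file names and the source host are assumptions for illustration, not the asker's actual code (error handling omitted):

<?php
// Map the requested path to a local cache file (hypothetical layout).
$path      = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
$cacheFile = __DIR__ . '/cache/' . md5($path) . '.txt';

if (!is_file($cacheFile)) {
    // First access: fetch from the source site and store a local copy
    // (requires allow_url_fopen; curl would work as well).
    $data = file_get_contents('https://source.example.com' . $path);
    file_put_contents($cacheFile, $data);
} else {
    // Later accesses: just read the local copy.
    $data = file_get_contents($cacheFile);
}

header('Content-Type: text/plain; charset=utf-8');
echo $data;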

questions:
1. Can pages still open within a second even when the amount of data is very large?
2. Where is the bottleneck: memory or IO?
3. After running for a while, opening pages becomes very slow, and it gets much better after restarting the server. Why?
4. Would saving the data to a MySQL database make it even slower? At present it is plain text; on access, PHP maps the request to a file with a simple path matching algorithm, reads the file, and returns it to the user.
5. Are there any related books to learn from?
6. Can this be mitigated by upgrading to PHP 7?

Php
Mar.06,2021

How is the file read?


Split the files and store them by date.
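
For example, sharding the cache directory by date keeps the number of files per directory small. A sketch, assuming $path is the request path as in the earlier snippet:

<?php
// Put each day's collected files in their own directory,
// e.g. cache/2021/03/06/<hash>.txt
$dir = __DIR__ . '/cache/' . date('Y/m/d');
if (!is_dir($dir)) {
    mkdir($dir, 0755, true); // create the date directory on first use
}
$cacheFile = $dir . '/' . md5($path) . '.txt';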


"through a simple path matching algorithm, use php to read the file and return it to the user"-whether it can be directly made into static


by url rewriting
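
For instance, if the cached txt files are written under the web root using the same path as the URL, the web server can serve them statically and only hand the request to PHP when the file has not been collected yet. A minimal nginx sketch (the fetch.php name is an assumption):

location / {
    # Serve the cached file directly if it exists; otherwise let PHP fetch it.
    try_files $uri /fetch.php?path=$uri;
}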

Is a PHP-FPM memory leak exhausting the memory?
Check whether this line is still commented out, i.e. the maximum number of requests each php-fpm process handles is not set.
If it is not set, the worker processes keep accumulating memory (each request leaves some state behind in the process, eventually exhausting memory):

;pm.max_requests = 500
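
Uncommenting it in the php-fpm pool configuration (500 is just the example value above) makes each worker exit and respawn after that many requests, which releases any leaked memory:

pm.max_requests = 500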

With that many small files it is bound to be slow. On the contrary, I don't think putting the data in a database would be that slow.


  • performance bottlenecks here usually come down to IO, bandwidth, memory, or concurrency
  • it is best to collect the data through an MQ, to avoid server crashes caused by unpredictable concurrency (see the sketch after this list)
  • since the data is plain text that is only read, consider letting clients access the text files directly as static resources (just like js and css files), which removes the extra overhead of going through PHP; also consider adding a CDN
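
A minimal sketch of the MQ idea, using Redis lists via the phpredis extension as the queue; the queue name, source host and file layout are assumptions:

<?php
// Worker: pull collection jobs off a Redis list and fetch them one at a time,
// so bursts of user requests do not become bursts of remote fetches.
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

while (true) {
    // Block for up to 5 seconds waiting for a job; 'collect_queue' is a hypothetical key.
    $job = $redis->brPop(['collect_queue'], 5);
    if (!$job) {
        continue;
    }
    $path = $job[1]; // brPop returns [key, value]
    $data = file_get_contents('https://source.example.com' . $path);
    file_put_contents(__DIR__ . '/cache/' . md5($path) . '.txt', $data);
}

The web side then only needs to lPush the requested path onto 'collect_queue' when the local file is missing, instead of fetching from the remote site inside the user's request.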