Should business soft deletion (pseudo-deletion) be designed like this?

In the era of big data, data is valuable, so most systems avoid physical deletion: even when a record is "deleted", a copy is kept. That means a soft-deletion (pseudo-deletion) scheme has to be designed. I'm still shallow in my learning and don't know this kind of logic well, so I'd like to ask for advice!

Here are two methods I found by searching, but I don't think either is very good. Does anyone have a better approach?

1. In general, the system adds a DeletedAt field to each record to mark whether it has been deleted. But then every query (single-table and multi-table joins alike) has to filter on DeletedAt, which adds a whole dimension of complexity across the system. And if the table has a unique index, DeletedAt has to be added to that index so deleted rows don't block re-insertion; as the data volume grows, the index consumes a lot of space.
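The index-bloat concern in method 1 can be sidestepped in databases that support partial indexes (e.g. PostgreSQL and SQLite; MySQL does not, which is where the "add DeletedAt to the unique index" trick comes from). Below is a minimal sketch using Python's built-in sqlite3 and a hypothetical `users` table: uniqueness is enforced only among live rows, so `deleted_at` never needs to join the index at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id         INTEGER PRIMARY KEY,
        email      TEXT NOT NULL,
        deleted_at TEXT   -- NULL means the row is live
    )
""")
# Partial unique index: uniqueness applies only to live rows, so a deleted
# email can be registered again, and deleted rows take no space in this index.
conn.execute(
    "CREATE UNIQUE INDEX ux_users_email ON users(email) "
    "WHERE deleted_at IS NULL"
)

conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
# Soft delete: stamp deleted_at instead of running DELETE.
conn.execute(
    "UPDATE users SET deleted_at = datetime('now') WHERE email = 'a@example.com'"
)
# Re-registering the same email now succeeds; the old row has left the index.
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")

live = conn.execute(
    "SELECT COUNT(*) FROM users WHERE deleted_at IS NULL"
).fetchone()[0]
print(live)  # → 1
```

Every read query still needs the `deleted_at IS NULL` filter, though; the partial index only solves the unique-key and space problems, not the query-complexity one.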

2. The other way is to add a deleted-records archive table: deleted rows are removed from the main table and migrated into the archive table, which completes the soft delete. But data recovery becomes troublesome. (I'm also not clear on how the archive table should be designed; I'd like to ask about that.)

Reply to the answers that suggest marking the data's state with only a single status field:

Marking with only a single status field is defective when there is a unique key, and unique keys appear in many scenarios.
Here is the scenario:
the system requires that a user can create only one project; of course, the user can delete it and create a new one.
A single status field marks the project state: 1 = enabled, 2 = deleted.

To enforce "one project per user", the project table effectively needs a unique key on (user_id, status); project_id is shown for context:

user_id  project_id  status
U1       p1          2
U1       p2          1

The above is the state after user U1 created project p1, deleted it, and created project p2. Now, if the user also deletes p2, the update would produce a second (U1, status=2) pair, which violates the table's unique-key constraint and raises an exception on delete. This exposes the drawback of marking with only a single status field.
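The collision described above can be reproduced directly. A minimal sqlite3 sketch, assuming the unique key is on (user_id, status) as in the scenario:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE projects (
        user_id    TEXT,
        project_id TEXT,
        status     INTEGER,         -- 1 = enabled, 2 = deleted
        UNIQUE (user_id, status)    -- at most one project per user per status
    )
""")
conn.execute("INSERT INTO projects VALUES ('U1', 'p1', 2)")  # p1 already deleted
conn.execute("INSERT INTO projects VALUES ('U1', 'p2', 1)")  # p2 active

try:
    # "Deleting" p2 would create a second (U1, 2) pair, colliding with p1's row.
    conn.execute(
        "UPDATE projects SET status = 2 "
        "WHERE user_id = 'U1' AND project_id = 'p2'"
    )
    violated = False
except sqlite3.IntegrityError:
    violated = True
print(violated)  # → True
```

The UPDATE fails with a UNIQUE constraint error, exactly the "deletion exception" described above.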

In an earlier project I also used status-field marking. On delete-and-rebuild I would first find the deleted record and re-enable it, but that isn't correct either: the previously deleted project may be referenced by other records, or by statistics keyed on its id, so re-enabling the old record is wrong from the requirements' point of view.


Both methods are fine.

On the deleted-archive table: say you have a content table; create a content_deleted table with the same columns. Whenever you delete a row from content, insert it into content_deleted at the same time.
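That two-step move should happen atomically, or a crash between the two statements loses (or duplicates) the row. A minimal sqlite3 sketch of the content / content_deleted pattern, wrapping the copy and the delete in one transaction:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE content (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE content_deleted (
        id INTEGER PRIMARY KEY, body TEXT, deleted_at TEXT
    );
    INSERT INTO content (body) VALUES ('hello'), ('world');
""")

def archive_delete(conn, content_id):
    """Move one row from content to content_deleted atomically."""
    with conn:  # commits on success, rolls back on any exception
        conn.execute(
            "INSERT INTO content_deleted (id, body, deleted_at) "
            "SELECT id, body, datetime('now') FROM content WHERE id = ?",
            (content_id,),
        )
        conn.execute("DELETE FROM content WHERE id = ?", (content_id,))

archive_delete(conn, 1)

remaining = conn.execute("SELECT COUNT(*) FROM content").fetchone()[0]
archived = conn.execute("SELECT COUNT(*) FROM content_deleted").fetchone()[0]
print(remaining, archived)  # → 1 1
```

`with conn:` uses the connection as a transaction context manager, so the insert and delete either both commit or both roll back.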

I don't have a better way. Soft deletion is a requirement in itself, and any requirement increases the program's complexity; I think that's normal.

I use both methods.

For the first, I use a status field: when a record is deleted, the field is updated to -1. This has several benefits: -1 means deleted, 0 means disabled, 1 means enabled, and more states can be added later.


Business systems generally do not delete data with SQL like DELETE FROM; you simply set the record's status field to unavailable.

Of course, if real deletion was used from the start of the design, some problems may be encountered, and some recovery operations will have to be done through the database itself. Generally speaking, a status flag is recommended, though it does leave some data redundancy (useless rows remain in the table).

On the deleted-archive table: I agree with the reply above, and would add that the move should be wrapped in a transaction. However, I don't recommend using the archive table to recover data: the business design is complex and hard to express through simple logic. If you need to recover data, it's better to use the database's own means, such as backups.


We use the first of the three approaches.


Generally, if the data volume is not very large, a single field is enough. For example, is_delete marks the recycle bin: 0 means live, 1 means the row has been deleted. This is easier to handle; if necessary, you can also write deleted rows to another table.


The second method is relatively easy to implement. On delete, serialize the row's columns and values to JSON (excluding fields that don't matter) and save that to the deleted-records archive table; to restore, just map the JSON back to the table name and column names. This can all be encapsulated in a method.
As an optimization for deleting large amounts of data, combine it with the first method: mark rows first, then migrate them in regular batches with the encapsulated method.
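A hedged sketch of this JSON variant using sqlite3 and a hypothetical `item` table. The names `deleted_archive`, `delete_to_archive`, and `restore_from_archive` are illustrative, and for brevity the sketch assumes at most one archived row per table; a real version would also match on the row's id.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows behave like dicts
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE deleted_archive (
        origin_table TEXT,
        row_json     TEXT,
        deleted_at   TEXT DEFAULT (datetime('now'))
    );
    INSERT INTO item (name, price) VALUES ('pen', 1.5);
""")

def delete_to_archive(conn, table, row_id):
    """Serialize the row to JSON and move it into deleted_archive.
    NOTE: table names cannot be bound as SQL parameters, so `table`
    must come from a trusted whitelist, never from user input."""
    row = conn.execute(f"SELECT * FROM {table} WHERE id = ?", (row_id,)).fetchone()
    with conn:  # archive insert + delete succeed or fail together
        conn.execute(
            "INSERT INTO deleted_archive (origin_table, row_json) VALUES (?, ?)",
            (table, json.dumps(dict(row))),
        )
        conn.execute(f"DELETE FROM {table} WHERE id = ?", (row_id,))

def restore_from_archive(conn, table):
    """Rebuild an INSERT from the archived JSON: keys become column names."""
    payload = conn.execute(
        "SELECT rowid, row_json FROM deleted_archive WHERE origin_table = ?",
        (table,),
    ).fetchone()
    row = json.loads(payload["row_json"])
    cols = ", ".join(row)
    marks = ", ".join("?" for _ in row)
    with conn:
        conn.execute(f"INSERT INTO {table} ({cols}) VALUES ({marks})",
                     tuple(row.values()))
        conn.execute("DELETE FROM deleted_archive WHERE rowid = ?",
                     (payload["rowid"],))

delete_to_archive(conn, "item", 1)
restore_from_archive(conn, "item")

restored = conn.execute("SELECT name, price FROM item WHERE id = 1").fetchone()
```

The downside mentioned above is visible here: restore has to regenerate column lists per table, and schema changes between delete and restore will break the round trip.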


If soft-deleted data needs to be recoverable, you can add a del field defaulting to 0: newly added goods have del = 0, and deleting them sets it to 1 (not really deleted). Of course, only rows with del = 0 are shown. That is soft deletion.


In fact, I've seen the pros and cons of both kinds.
The first's problem is that all queries need to be changed, not to mention the space consumption.
The second's problem is that restoring needs extra bookkeeping on the archived data, such as an origin_table column, and recovery means parsing and generating statements for all kinds of types, which is troublesome.

So I just had a flash of inspiration: there is an approach I think is "perfect". I hope we can discuss it together.

For example, we have an item table.
Every time a row is pseudo-deleted, it is deleted from the original table directly and, at the same time, moved intact into a mirror table item_rmd built for this table (its id column must NOT be auto-increment; this is very important). The ids in the two tables then remain exactly what they were in the original table.
If we need to restore a row, it is also very simple: move it back. Querying the original table in any way needs no additional logic at all.

It's like a pizza: you can cut out a slice at random and later insert it back intact, without affecting any existing business logic and without any SQL changes.
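A minimal sqlite3 sketch of this mirror-table idea: `item_rmd` has the same columns as `item`, but a plain (non-auto-increment) primary key, so moved rows keep their original ids and slot back in exactly where they were.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
    -- mirror table: same columns, but id is NOT auto-increment,
    -- so moved rows keep their original ids
    CREATE TABLE item_rmd (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO item (name) VALUES ('a'), ('b'), ('c');
""")

def soft_delete(conn, item_id):
    """Cut the slice out: move the row to item_rmd, id intact."""
    with conn:
        conn.execute("INSERT INTO item_rmd SELECT * FROM item WHERE id = ?",
                     (item_id,))
        conn.execute("DELETE FROM item WHERE id = ?", (item_id,))

def restore(conn, item_id):
    """Put the slice back: the explicit id lands in its original position."""
    with conn:
        conn.execute("INSERT INTO item SELECT * FROM item_rmd WHERE id = ?",
                     (item_id,))
        conn.execute("DELETE FROM item_rmd WHERE id = ?", (item_id,))

soft_delete(conn, 2)
restore(conn, 2)

ids = [r[0] for r in conn.execute("SELECT id FROM item ORDER BY id")]
print(ids)  # → [1, 2, 3]
```

One caveat not covered by the sketch: foreign keys pointing at the original table will fail (or be cascaded) while the row lives in the mirror table, so this works cleanly only when referencing rows are moved together or such constraints are absent.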


1. Add an is_del field; 1 means deleted.
2. Purge is_del = 1 rows regularly, so queries don't keep paying for the index space occupied by deleted rows.
3. Create a backup table for deleted data, and migrate rows with is_del = 1 (primary key included) into it.
Would that work?
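Steps 2 and 3 can be combined into one periodic job. A minimal sqlite3 sketch with hypothetical `orders` / `orders_backup` tables, moving marked rows in batches inside a transaction:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY, note TEXT, is_del INTEGER DEFAULT 0
    );
    CREATE TABLE orders_backup (id INTEGER PRIMARY KEY, note TEXT);
    INSERT INTO orders (note) VALUES ('a'), ('b'), ('c');
""")

# Online path: soft delete is just a flag flip, cheap and reversible.
conn.execute("UPDATE orders SET is_del = 1 WHERE id IN (1, 3)")

def purge_deleted(conn, batch_size=100):
    """Periodic job: move is_del = 1 rows (primary key included)
    into the backup table, then remove them from the live table."""
    with conn:  # one transaction per batch
        conn.execute(
            "INSERT INTO orders_backup (id, note) "
            "SELECT id, note FROM orders WHERE is_del = 1 LIMIT ?",
            (batch_size,),
        )
        conn.execute(
            "DELETE FROM orders WHERE id IN (SELECT id FROM orders_backup)"
        )

purge_deleted(conn)

live_ids = [r[0] for r in conn.execute("SELECT id FROM orders ORDER BY id")]
print(live_ids)  # → [2]
```

Between flag flip and purge, queries still need the `is_del = 0` filter; the purge only bounds how long deleted rows pollute the live table's indexes.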


Generally, the first is fine. As for the increased system complexity you mention, that's normal business logic. Add a field delete_at defaulting to NULL, write the deletion time into it, and query with WHERE delete_at IS NULL. If the framework doesn't handle it automatically, you can encapsulate it in the model class.
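Encapsulating the `delete_at IS NULL` filter in a model class might look like the minimal sketch below (`SoftDeleteModel` is a hypothetical name; ORMs such as Laravel's Eloquent and Go's GORM ship this as built-in soft-delete scopes):

```python
import sqlite3

class SoftDeleteModel:
    """Minimal sketch: every read automatically appends delete_at IS NULL,
    so callers never have to remember the filter."""

    def __init__(self, conn, table):
        # `table` is fixed at construction time, not user input,
        # since table names cannot be bound as SQL parameters.
        self.conn = conn
        self.table = table

    def find_all(self, where="1=1", params=()):
        sql = (f"SELECT * FROM {self.table} "
               f"WHERE delete_at IS NULL AND ({where})")
        return self.conn.execute(sql, params).fetchall()

    def delete(self, row_id):
        # Soft delete: stamp the time instead of removing the row.
        self.conn.execute(
            f"UPDATE {self.table} SET delete_at = datetime('now') WHERE id = ?",
            (row_id,),
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, delete_at TEXT)"
)
conn.execute("INSERT INTO posts (title) VALUES ('x'), ('y')")

posts = SoftDeleteModel(conn, "posts")
posts.delete(1)
visible = posts.find_all()
print(len(visible))  # → 1
```

Callers only ever see live rows, which is the point: the one-dimension-of-complexity cost from the question is paid once, inside the model, instead of in every query.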
