A question of binary classification in machine learning

Requirements:
Currently, 10 to 20 high-quality articles are manually selected every day from the 1,000 to 2,000 articles we crawl, and these are pushed to customers. We would like to replace this manual selection with automatic screening by a machine.

How can machine learning be used to implement this requirement?

A sample looks like this:

platform | title | content | content length | whether pushed (label)

Mar.24,2021

How big is your sample? Do you have on the order of 1,000,000 examples? If the sample is very large, you can go straight to deep learning. If it is not that large, you can apply logistic regression directly to the samples you described. However, if you extract your own features, too few features may make the screening inaccurate, while too many may cause overfitting. You will have to experiment yourself.
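For the small-sample case, logistic regression on the article text is enough to get started. A minimal sketch with scikit-learn, assuming you have a labeled history of pushed / not-pushed articles (the toy texts and labels below are illustrative, not real data):

```python
# Logistic regression on TF-IDF features of article text.
# Label convention (assumed): 1 = manually selected (pushed), 0 = not pushed.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "in-depth analysis of market trends with original reporting",
    "click here to win a free prize now",
    "exclusive interview with the lead engineer of the project",
    "buy cheap watches discount sale",
]
labels = [1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Score a new article; push it when the predicted probability is high enough.
prob_push = model.predict_proba(["new original reporting on market trends"])[0][1]
print(prob_push)
```

In practice you would rank each day's 1,000 to 2,000 crawled articles by `prob_push` and take the top 10 to 20, which matches the current manual workflow better than a hard 0.5 threshold.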


Think carefully about feature engineering first, and grab more dimensions while crawling the news. The key question is then how to judge "high quality".

  • For example, the number of views, the number of comments, the number of reposts, the article length, and whether it contains advertising may all be factors that affect the "quality" of an article.
  • Then pass these features into a model (LR / decision tree / SVM) as input, and take the push / no-push decision as output.

In addition, if you do not want to do feature engineering, you can consider deep learning. Turn each news article into a sequence of word embeddings (treating the long text as a token sequence), feed that into a neural network, and output a binary classification: high quality or not. The word embeddings can either be pre-trained in advance or trained jointly with the model.
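A hedged PyTorch sketch of that route, with embeddings trained jointly with the classifier; the vocabulary size, embedding dimension, and mean-pooling choice are toy assumptions (a real model would tokenize the text and likely use an RNN or Transformer instead of pooling):

```python
# Word embeddings + binary classifier, trained end to end.
import torch
import torch.nn as nn

VOCAB_SIZE, EMBED_DIM = 1000, 32  # illustrative sizes

class ArticleClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        # Embedding trained together with the classifier; to use pre-trained
        # vectors instead, initialize with nn.Embedding.from_pretrained(...).
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.fc = nn.Linear(EMBED_DIM, 2)  # two classes: push / skip

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer word indices
        vecs = self.embed(token_ids)   # (batch, seq_len, EMBED_DIM)
        pooled = vecs.mean(dim=1)      # average over the sequence
        return self.fc(pooled)         # (batch, 2) class logits

model = ArticleClassifier()
batch = torch.randint(0, VOCAB_SIZE, (4, 50))  # 4 fake articles, 50 tokens each
logits = model(batch)
print(logits.shape)
```

Training would then minimize `nn.CrossEntropyLoss()` over the logits against the push/skip labels in the usual way.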
