How should I use regular expression substitution in this program?

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta http-equiv="Cache-Control" content="no-cache"/>
    <meta id="viewport" name="viewport" content="width=device-width,initial-scale=1.0,minimum-scale=1.0, maximum-scale=2.0" />
    <link rel="icon" sizes="any" mask href="https://h5.sinaimg.cn/upload/2015/05/15/28/WeiboLogoCh.svg" color="black">
    
    

One of my texts looks like this. I want to delete the first line, <?xml version="1.0" encoding="UTF-8"?>, so I wrote the following in Python:
sss = re.sub("<?xml version="1.0" encoding="UTF-8"?>","",html)
print(sss)

But I found it didn't work. How should I write it?

Mar.14,2021

First of all, you don't need a regular expression here; a plain string replacement is enough:

sss = html.replace('<?xml version="1.0" encoding="UTF-8"?>', '')
print(sss)
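
As a minimal sketch of the plain-string approach, the snippet below uses a shortened stand-in for the page (the `html` value here is an assumption, not the asker's full text). It passes a count of 1 so that only the first occurrence is removed, then strips the newline the declaration leaves behind.

```python
# Shortened stand-in for the page source (assumed for illustration).
html = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE html>\n'
    '<html></html>'
)
declaration = '<?xml version="1.0" encoding="UTF-8"?>'

# str.replace treats its argument as literal text, so "?" and "." need no escaping.
# The third argument limits the replacement to the first occurrence.
sss = html.replace(declaration, '', 1)

# Removing the declaration leaves its trailing newline at the start; drop it.
sss = sss.lstrip('\n')
print(sss)
```

Whether to strip the leftover newline depends on what consumes the text afterwards; it is included here only to show that the replacement itself does not remove it.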

Secondly, even if you do use a regular expression, the question mark (?) has a special meaning in regex (it makes the preceding character optional), so it needs to be escaped:

sss = re.sub(r'<\?xml version="1\.0" encoding="UTF-8"\?>', '', html)
print(sss)
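
If you would rather not escape the pattern by hand, `re.escape` can do it for you. The sketch below again uses a shortened, assumed `html` value, and it also shows why the unescaped pattern silently matches nothing: `"?>` is read as "an optional double quote followed by `>`", so the literal `?` in the text is never matched.

```python
import re

# Shortened stand-in for the page source (assumed for illustration).
html = '<?xml version="1.0" encoding="UTF-8"?>\n<html></html>'
declaration = '<?xml version="1.0" encoding="UTF-8"?>'

# Unescaped: "?" is a quantifier, so the pattern never matches the literal "?".
unescaped = re.sub(declaration, '', html)
print(unescaped == html)  # True: nothing was replaced

# re.escape() backslash-escapes the regex metacharacters in a literal string.
escaped = re.sub(re.escape(declaration), '', html, count=1)
print(repr(escaped))
```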

You may want to read up more on regular expressions.


It fails because the text you want to replace uses double quotation marks, but the pattern was written with single quotation marks.
In addition, the question mark needs to be escaped.

Are you new to Python?

   
