How to convert HTML using replace or remove commands in C# or Java

Keywords: common, replace, remove, HTML, convert, Java | Updated: 2023-09-27 18:08:05

    HTML 1: I'm getting this in a string ->
    S=        
           "<html>  <head> <link rel='stylesheet' type='text/css'
             href='http://www.taxmann.com/css/taxmannstyle.css' /> 
                 </head>  <body ><html>
                <body style='background-color:Black;font-size:30px;color:#fff;'>
        <div id=\"digest\">\r\n   
                   <p class=\"threedigest\">ST : Extended period of limitation 
                cannot be invoked for not paying tax if there was divergence 
        of opinion during relevant 
                period and 
                some judgments were in favour of assessee, 
                as there could be no suppression/wilful mis-statement
         by assessee</p>\r\n   
                 </div></body></html></body></html>"

Note: this is the HTML I receive in the first case, and it is correct.

But string HTML 2 ->
            "<html> 
                     <head> <link rel='stylesheet
                    ' type='text/css' href='http://www.taxmann.com/css/taxmannstyle.css' /> 
                     </head>  <body ><html><body style='background-color:Black;font-size:30px;color:#fff;'>
                    <html>\r\n<head>
                    <link href='http://www.taxmann.com/TaxmannWhatsnewService/Styles/style.css' rel='stylesheet' type='text/css' />
                    \r\n<title>Rs.560-crore tax evasion detected</title>\r\n<style type=\"text/css\">
            \r\nbody
                    {font-family:Arial, Helvetica, sans-serif; font-size:12px; 
                line-height:18px;text-align:justify;}
                    \r\n.w100{width:100%;}\r\n.fl-l{float:left;}\r\n.ffla{font-family:Arial, 
                Helvetica, sans-serif;}
                    \r\n.fs18{font-size:18px;}\r\n.mart10{margin-top:10px;}\r\n.fcred{color:#c81616;}
                \r\n.tc{text-align:center;}\r\n.tu{text-transform:uppercase;}\r\n.lh18{line-height:18px;}\r\n</style>\r\n</head>\r\n<body>\r\n
                <div class=\"w100 fl-l\">\r\n<div class=\"w100 fl-l ffla fs18 mart10 fcred ttunderline tc tu\">
                    Rs.560-crore tax 
vasion detected</div>\r\n\r\n<div class=\"w100 fl-l lh18 mart10\">
                The Central Excise Intelligence, 
Chennai Zone, has detected 164 cases involving excise
                 and service tax evasion of Rs.560 crore in 2012- 13.
     A total of 166 show cause notices
                 have been issued involving Rs.500 crore for 
    various central excise and service 
                tax cases during the year.
 – www.business-standard.com</div>\r\n\r\n
            </div>\r\n</body>\r\n
                    </html>\r\n</body>
    </html></body>
    </html>"

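I want to convert HTML 2 into the same format as HTML 1. I have tried a lot but could not do it. I also tried removing some of the HTML content, but that did not work, and I don't know how to make HTML 2 match HTML 1. I even tried removing this content in Java, without success. Please help! A solution using replace or remove commands in any programming language is fine.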


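Try removing the two unwanted html tags in your code. The response from the server contains two extra html tags, which is why you are not getting the expected result. Remove all the unwanted tags and tidy up the HTML.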

public class TestScriptClass {
    public static void main(String[] args) {
        // HTML 2 as received from the server (note the leading space and the nested <html>/<body> wrappers).
        String inputValue = " ";
        inputValue = inputValue + "<html><head> <link rel='stylesheet' type='text/css' href='http://www.taxmann.com/css/taxmannstyle.css' />" +
                "</head>  <body ><html><body style='background-color:Black;font-size:30px;color:#fff;'>" +
                "<html>\r\n<head> <link href='http://www.taxmann.com/TaxmannWhatsnewService/Styles/style.css' rel='stylesheet' type='text/css' />" +
                "\r\n<title>Rs.560-crore tax evasion detected</title>\r\n<style type=\"text/css\">" +
                "  \r\nbody{font-family:Arial, Helvetica, sans-serif; font-size:12px; " +
                " line-height:18px;text-align:justify;} \r\n.w100{width:100%;}\r\n.fl-l{float:left;}\r\n.ffla{font-family:Arial, " +
                "Helvetica, sans-serif;} \r\n.fs18{font-size:18px;}\r\n.mart10{margin-top:10px;}\r\n.fcred{color:#c81616;}" +
                " \r\n.tc{text-align:center;}\r\n.tu{text-transform:uppercase;}\r\n.lh18{line-height:18px;}\r\n</style>\r\n</head>\r\n<body>\r\n" +
                "  <div class=\"w100 fl-l\">\r\n<div class=\"w100 fl-l ffla fs18 mart10 fcred ttunderline tc tu\">" +
                "   Rs.560-crore tax " +
                "vasion detected</div>\r\n\r\n<div class=\"w100 fl-l lh18 mart10\">" +
                " The Central Excise Intelligence, " +
                "Chennai Zone, has detected 164 cases involving excise" +
                "  and service tax evasion of Rs.560 crore in 2012- 13." +
                "  A total of 166 show cause notices" +
                "   have been issued involving Rs.500 crore for " +
                "  various central excise and service " +
                "  tax cases during the year." +
                "– www.business-standard.com</div>\r\n\r\n" +
                " </div>\r\n</body>\r\n" +
                "   </html>\r\n</body>" +
                "  </html></body>" +
                "   </html>";
        // String.replace matches literally (no regex): the leading fragment is swapped for a shorter one,
        // which drops the outer "</head>  <body ><html>" that follows the first <link> tag.
        String resultValue = inputValue.replace(
                "<html><head> <link rel='stylesheet' type='text/css' href='http://www.taxmann.com/css/taxmannstyle.css' /></head>  <body ><html>",
                " <html><head> <link rel='stylesheet' type='text/css' href='http://www.taxmann.com/css/taxmannstyle.css' />");
        System.out.println(resultValue);
    }
}
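
A literal replace like the one above only works when the unwanted fragment is known character for character. As a rough sketch of a more general "remove the unwanted tags" step, the code below keeps only the content of the innermost body and wraps it again in the HTML 1 wrapper. The class name InnerBodyExtractor and the methods innermostBody and toHtml1Format are hypothetical names made up for this illustration, and the wrapper strings are assumed to be the ones shown in HTML 1 of the question. It relies on regular expressions from the standard library, so it may misbehave on input where "<body" appears inside comments or scripts; a real HTML parser would be more robust.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class InnerBodyExtractor {
    // Opening <body> tag with optional attributes, matched case-insensitively.
    private static final Pattern BODY_OPEN =
            Pattern.compile("<body\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // Closing </body> tag, matched case-insensitively.
    private static final Pattern BODY_CLOSE =
            Pattern.compile("</body\\s*>", Pattern.CASE_INSENSITIVE);

    // Assumption: the wrapper is copied from HTML 1 in the question.
    static final String WRAPPER_OPEN =
            "<html>  <head> <link rel='stylesheet' type='text/css' "
            + "href='http://www.taxmann.com/css/taxmannstyle.css' /> </head>  "
            + "<body ><html><body style='background-color:Black;font-size:30px;color:#fff;'>";
    static final String WRAPPER_CLOSE = "</body></html></body></html>";

    // Returns the trimmed text between the last opening <body ...> tag
    // and the first </body> that follows it.
    static String innermostBody(String html) {
        Matcher open = BODY_OPEN.matcher(html);
        int start = -1;
        while (open.find()) {
            start = open.end(); // remember where the last <body ...> tag ends
        }
        if (start < 0) {
            return html; // no <body> tag at all: nothing to unwrap
        }
        Matcher close = BODY_CLOSE.matcher(html);
        int end = close.find(start) ? close.start() : html.length();
        return html.substring(start, end).trim();
    }

    // Rebuilds the document in the HTML 1 layout around the innermost body content.
    static String toHtml1Format(String html) {
        return WRAPPER_OPEN + innermostBody(html) + WRAPPER_CLOSE;
    }

    public static void main(String[] args) {
        // Shortened stand-in for HTML 2: a full document nested inside two wrappers.
        String html2 = "<html><body ><html><body style='x'><html>\r\n<head><title>t</title></head>"
                + "\r\n<body>\r\n<div class=\"w100\">news</div>\r\n</body>\r\n</html>\r\n"
                + "</body></html></body></html>";
        System.out.println(toHtml1Format(html2));
    }
}
```

Note that this sketch discards the inner head section, so the style rules that HTML 2 carries in its own style block are lost. If those rules are needed, they would have to be extracted separately and merged into the outer head.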