
Fetching a web page's TITLE from its URL: some sites can't be fetched, my approach is clumsy, advice wanted

Author: 模板之家 | 2020-3-29 12:34 | 105 followers


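Last edited by u012716911 on 2013-11-04 11:25:29
I wrote this code the way it occurred to me, and I don't know whether there is a better way. Any pointers would be appreciated.
It works for some sites, such as Baidu, but fails for others, for example the home page of 太平洋汽车 (www.pcauto.com.cn).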


public function set_title()
{
    // Read the incoming URL
    $url = $_POST['url'];
    // $url = "www.pcauto.com.cn"; // the title cannot be fetched for this one!

    // A series of curl settings
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_ENCODING, 'gzip');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $content_source = curl_exec($ch);
    curl_close($ch);

    // Detect the encoding of the fetched content
    $encode = mb_detect_encoding($content_source, array('GB2312', 'GBK', 'UTF-8', 'ASCII'));

    // Convert to UTF-8
    $content_source = iconv($encode, 'utf-8//IGNORE', $content_source);

    // Extract the title
    if (preg_match("/<title>(.*?)<\/title>/i", $content_source, $title))
    {
        echo $title[1];
    }
    else
    {
        echo 'Failed to fetch the title';
    }
}

------ Solution --------------------
The problem is in the regular expression match. Add the s modifier and it works:

    if (preg_match("/<title>(.*?)<\/title>/is", $content_source, $title))

s: when this modifier is set, the dot metacharacter (.) in the pattern matches all characters, including newlines. Without it, newlines are excluded. That is why the original pattern never matches a <title> element whose text contains a line break.
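To make the effect of the s modifier concrete, here is a minimal sketch. The helper name `extract_title`, the `[^>]*` tolerance for attributes on the opening tag, and the sample markup are assumptions added for illustration; they are not part of the original post. It assumes the HTML has already been converted to UTF-8, as in the code above.

```php
<?php
// Hypothetical helper (not from the original post): return the text of the
// first <title> element in an HTML string, or null if none is found.
function extract_title(string $html): ?string
{
    // i: match the tag name case-insensitively.
    // s: let "." match newlines too, so multi-line titles are captured.
    // [^>]* is an extra assumption: it tolerates attributes on <title ...>.
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        // Strip the leading/trailing whitespace left by the line breaks.
        return trim($m[1]);
    }
    return null;
}

// Made-up sample markup whose title is spread over several lines.
$html = "<html><head><TITLE>\n  Example Auto Site\n</TITLE></head></html>";

// The original pattern (no s modifier): "." stops at the newline, no match.
var_dump(preg_match('/<title>(.*?)<\/title>/i', $html)); // int(0)

// With the s modifier the title is found.
var_dump(extract_title($html)); // string(17) "Example Auto Site"
```

Returning null instead of echoing an error message leaves it to the caller to decide what to show when no title is found.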
Original author: Internet | Source: collected from the web