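How to extract JSON from HTML using Python BeautifulSoup
Use BeautifulSoup: locate the div, read its data-product attribute, and parse that string with json.loads.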
from bs4 import BeautifulSoup
import json
s = """
<div class="header-product js-header-product" data-product='{
"sku": "218009200",
"fullTitle": "iPhone 7 Apple 32GB Preto Matte 4G Tela 4.7”Retina - Câm. 12MP + Selfie 7MP iOS 11 Proc. Chip A10",
"baseUrl": "https://www.myurl.com.br",
"variationPath": "iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10/p/2180092/te/iph7/",
"imageUrl": "https://a-static.mlcdn.com.br/{w}x{h}/iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10/myurl/218009200/f06f03c5ea2ba95deaa3e55e5e0e687e.jpg",
"urlSubcategories": "https://www.myurl.com.br/iphone-7-e-iphone-7-plus/celulares-e-smartphones/s/te/iph7/",
"quantitySellers": 1,
"categoryId": "te",
"serviceUrl": "/produto/garantia-plus/?product=218009200&marketplaceSellerId=myrul&productdiscountPrice=3199.00&productCashPrice=2879.10&productQuantity=10",
"title": "iPhone 7 Apple 32GB Preto Matte 4G Tela 4....",
"serviceUrl": "/produto/garantia-plus/?product=218009200&marketplaceSellerId=myurl&productdiscountPrice=3199.00&productCashPrice=2879.10&productQuantity=10",
"bestPriceTemplate": " 2.879,10",
"installmentQuantity": "10",
"buyTogetherImage": "https://a-static.mlcdn.com.br/195x145/iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10/myurl/218009200/f06f03c5ea2ba95deaa3e55e5e0e687e.jpg",
"thumbailBuyTogether": "https://a-static.mlcdn.com.br/70x90/iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10/myurl/218009200/f06f03c5ea2ba95deaa3e55e5e0e687e.jpg",
"list_price_price_parcel_cash_price": "list_price_price_parcel_cash_price.html",
"listPrice": " 3.499,90",
"installmentAmount": " 319,90",
"priceTemplate": " 3.199,00",
"seller": "myurl",
"attributes": [{"label":"Cor","value":"Preto Matte","type":"color","id":"218009200","image":"https:\/\/a-static.mlcdn.com.br\/{w}x{h}\/iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10\/myurl\/218009200\/f06f03c5ea2ba95deaa3e55e5e0e687e.jpg","url":"\/iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10\/p\/218009200\/te\/iph7\/","selected":true,"is_delivery_available":true}],
"variations": ["Preto Matte"] }'> <h1 class="header-product__title" itemprop="name">iPhone 7 Apple 32GB Preto Matte 4G Tela 4.7”Retina - Câm. 12MP + Selfie 7MP iOS 11 Proc. Chip A10</h1> <small class="header-product__code">Código 218009200 <span class="header-product__separator"></span> <a class="header-product__text-interation js-floater-menu-link" href="#anchor-description">Ver descrição completa</a> <span class="header-product__separator"></span> <a class="header-product__text-interation" href="https://www.myurl.com.br/marcas/apple/" itemscope="" itemtype="http://schema.org/Brand"> <span itemprop="name">Apple</span> </a> <Meta content="sku:218009200" itemprop="identifier"/> <Meta content="http://schema.org/NewCondition" itemprop="itemCondition"/> </small> </div>
"""
soup = BeautifulSoup(s, "html.parser")
# Locate the div that carries the JSON payload in its data-product attribute.
element = soup.find("div", class_="header-product js-header-product")
print(element.attrs["data-product"])
# Parse the attribute string into a Python dict.
jsonData = json.loads(element.attrs["data-product"])
print(jsonData['sku'])
Output:
{
"sku": "218009200",
"fullTitle": "iPhone 7 Apple 32GB Preto Matte 4G Tela 4.7”Retina - Câm. 12MP + Selfie 7MP iOS 11 Proc. Chip A10",
"baseUrl": "https://www.myurl.com.br",
"variationPath": "iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10/p/2180092/te/iph7/",
"imageUrl": "https://a-static.mlcdn.com.br/{w}x{h}/iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10/myurl/218009200/f06f03c5ea2ba95deaa3e55e5e0e687e.jpg",
"urlSubcategories": "https://www.myurl.com.br/iphone-7-e-iphone-7-plus/celulares-e-smartphones/s/te/iph7/",
"quantitySellers": 1,
"categoryId": "te",
"serviceUrl": "/produto/garantia-plus/?product=218009200&marketplaceSellerId=myrul&productdiscountPrice=3199.00&productCashPrice=2879.10&productQuantity=10",
"title": "iPhone 7 Apple 32GB Preto Matte 4G Tela 4....",
"serviceUrl": "/produto/garantia-plus/?product=218009200&marketplaceSellerId=myurl&productdiscountPrice=3199.00&productCashPrice=2879.10&productQuantity=10",
"bestPriceTemplate": " 2.879,10",
"installmentQuantity": "10",
"buyTogetherImage": "https://a-static.mlcdn.com.br/195x145/iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10/myurl/218009200/f06f03c5ea2ba95deaa3e55e5e0e687e.jpg",
"thumbailBuyTogether": "https://a-static.mlcdn.com.br/70x90/iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10/myurl/218009200/f06f03c5ea2ba95deaa3e55e5e0e687e.jpg",
"list_price_price_parcel_cash_price": "list_price_price_parcel_cash_price.html",
"listPrice": " 3.499,90",
"installmentAmount": " 319,90",
"priceTemplate": " 3.199,00",
"seller": "myurl",
"attributes": [{"label":"Cor","value":"Preto Matte","type":"color","id":"218009200","image":"https:\/\/a-static.mlcdn.com.br\/{w}x{h}\/iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10\/myurl\/218009200\/f06f03c5ea2ba95deaa3e55e5e0e687e.jpg","url":"\/iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10\/p\/218009200\/te\/iph7\/","selected":true,"is_delivery_available":true}],
"variations": ["Preto Matte"] }
218009200
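One detail worth noting in this payload: the key "serviceUrl" appears twice, once with marketplaceSellerId=myrul and once with myurl. json.loads does not raise on repeated keys; it keeps the last value. If that matters, the object_pairs_hook parameter of json.loads can be used to surface duplicates. The following is a minimal sketch, not part of the original answer: loads_with_duplicates is a hypothetical helper name, and the payload is a shortened stand-in for the data-product value.

```python
import json
from collections import Counter


def loads_with_duplicates(text):
    """Parse JSON and also report keys that occur more than once in an object.

    Hypothetical helper for illustration; not part of json or bs4.
    """
    duplicates = []

    def hook(pairs):
        # pairs holds the (key, value) tuples of one JSON object, in document order.
        counts = Counter(key for key, _ in pairs)
        duplicates.extend(key for key, n in counts.items() if n > 1)
        # dict() keeps the last value for a repeated key, like json.loads does by default.
        return dict(pairs)

    data = json.loads(text, object_pairs_hook=hook)
    return data, duplicates


# Shortened stand-in for the data-product attribute value (assumption).
payload = '''{
"sku": "218009200",
"serviceUrl": "/produto/garantia-plus/?marketplaceSellerId=myrul",
"title": "iPhone 7 Apple 32GB Preto Matte 4G Tela 4....",
"serviceUrl": "/produto/garantia-plus/?marketplaceSellerId=myurl"
}'''

data, duplicates = loads_with_duplicates(payload)
print(duplicates)          # ['serviceUrl']
print(data["serviceUrl"])  # the second (last) occurrence wins
```

With the real attribute value, the same call would be loads_with_duplicates(element.attrs["data-product"]).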
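Question
I am doing some web scraping and, after building the soup with bs4, I need to extract the JSON content from a div it returns.
So now I have a string variable holding this text: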
<div class="header-product js-header-product" data-product='{
"sku": "218009200","fullTitle": "iPhone 7 Apple 32GB Preto Matte 4G Tela 4.7”Retina - Câm. 12MP + Selfie 7MP iOS 11 Proc. Chip A10","baseUrl": "https://www.myurl.com.br","variationPath": "iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10/p/2180092/te/iph7/","imageUrl": "https://a-static.mlcdn.com.br/{w}x{h}/iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10/myurl/218009200/f06f03c5ea2ba95deaa3e55e5e0e687e.jpg","urlSubcategories": "https://www.myurl.com.br/iphone-7-e-iphone-7-plus/celulares-e-smartphones/s/te/iph7/","quantitySellers": 1,"categoryId": "te","serviceUrl": "/produto/garantia-plus/?product=218009200&marketplaceSellerId=myrul&productDiscountPrice=3199.00&productCashPrice=2879.10&productQuantity=10","title": "iPhone 7 Apple 32GB Preto Matte 4G Tela 4....","serviceUrl": "/produto/garantia-plus/?product=218009200&marketplaceSellerId=myurl&productDiscountPrice=3199.00&productCashPrice=2879.10&productQuantity=10","bestPriceTemplate": " 2.879,10","installmentQuantity": "10","buyTogetherImage": "https://a-static.mlcdn.com.br/195x145/iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10/myurl/218009200/f06f03c5ea2ba95deaa3e55e5e0e687e.jpg","thumbailBuyTogether": "https://a-static.mlcdn.com.br/70x90/iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10/myurl/218009200/f06f03c5ea2ba95deaa3e55e5e0e687e.jpg","list_price_price_parcel_cash_price": "list_price_price_parcel_cash_price.html","listPrice": " 3.499,90","installmentAmount": " 319,90","priceTemplate": " 3.199,00","seller": "myurl","attributes": [{"label":"Cor","value":"Preto Matte","type":"color","id":"218009200","image":"https:\/\/a-static.mlcdn.com.br\/{w}x{h}\/iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10\/myurl\/218009200\/f06f03c5ea2ba95deaa3e55e5e0e687e.jpg","url":"\/iphone-7-apple-32gb-preto-matte-4g-tela-4-7-retina-cam-12mp-selfie-7mp-ios-11-proc-chip-a10\/p\/218009200\/te\/iph7\/","selected":true,"is_delivery_available":true}],"variations": ["Preto Matte"] }'> <h1 class="header-product__title" itemprop="name">iPhone 7 Apple 32GB Preto Matte 4G Tela 4.7”Retina - Câm. 12MP + Selfie 7MP iOS 11 Proc. Chip A10</h1> <small class="header-product__code">Código 218009200 <span class="header-product__separator"></span> <a class="header-product__text-interation js-floater-menu-link" href="#anchor-description">Ver descrição completa</a> <span class="header-product__separator"></span> <a class="header-product__text-interation" href="https://www.myurl.com.br/marcas/apple/" itemscope="" itemtype="http://schema.org/Brand"> <span itemprop="name">Apple</span> </a> <meta content="sku:218009200" itemprop="identifier"/> <meta content="http://schema.org/NewCondition" itemprop="itemCondition"/> </small> </div>
How can I get the JSON out of this as JSON, for example:
"sku": "218009200",
The reason is that I need to load this data into a database afterwards.
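For the database step, here is a minimal sketch using Python's built-in sqlite3 module. Everything specific in it is an assumption for illustration: the table name products, its columns, the hypothetical helper parse_brl, and the shortened product dict, which stands in for the result of json.loads on the data-product attribute. The thread does not say which database is in use, so SQLite is only a stand-in.

```python
import json
import sqlite3
from decimal import Decimal


def parse_brl(text):
    """Convert a Brazilian-formatted amount such as ' 3.199,00' to Decimal.

    Hypothetical helper: drops thousands dots, turns the decimal comma into a dot.
    """
    return Decimal(text.strip().replace(".", "").replace(",", "."))


# Stand-in for the dict returned by json.loads(element.attrs["data-product"]).
product = {
    "sku": "218009200",
    "title": "iPhone 7 Apple 32GB Preto Matte 4G Tela 4....",
    "priceTemplate": " 3.199,00",
    "seller": "myurl",
}

conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
conn.execute(
    """CREATE TABLE IF NOT EXISTS products (
           sku TEXT PRIMARY KEY,
           title TEXT,
           price TEXT,      -- decimal kept as text to avoid float rounding
           seller TEXT,
           raw_json TEXT    -- full payload, in case more fields are needed later
       )"""
)

# Placeholders (?) keep scraped values from being interpreted as SQL.
conn.execute(
    "INSERT OR REPLACE INTO products VALUES (?, ?, ?, ?, ?)",
    (
        product["sku"],
        product["title"],
        str(parse_brl(product["priceTemplate"])),
        product["seller"],
        json.dumps(product, ensure_ascii=False),
    ),
)
conn.commit()

row = conn.execute(
    "SELECT sku, price FROM products WHERE sku = ?", (product["sku"],)
).fetchone()
print(row)  # ('218009200', '3199.00')
```

Storing the raw JSON next to the extracted columns is one possible design choice: it allows further fields to be pulled out later without scraping the page again.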