Access denied when accessing a site with a Python script in headless mode
This code works fine when I run it on my Ubuntu laptop. However, when I deploy the same code to an AWS EC2 machine running Ubuntu, I get "Access Denied" from the site I am trying to scrape. I have changed the IP address of the AWS machine many times, so it is not an IP-level block.
Code that sets up the webdriver:
from fake_useragent import UserAgent
from selenium import webdriver

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--window-size=1420,1080')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument('--disable-application-cache')
# Pick a random user agent for this session
ua = UserAgent()
userAgent = ua.random
print(userAgent)
# f-string, so the value of userAgent is actually interpolated
chrome_options.add_argument(f'user-agent={userAgent}')
driver = webdriver.Chrome('/home/ubuntu/chromedriver', chrome_options=chrome_options)
# link is the URL being scraped (defined elsewhere)
driver.get(link)
print(driver.page_source)
It is not entirely clear under what circumstances you are being denied access to https://www.macys.com/ while trying to scrape it. However, there are a few things you need to consider:
The --disable-gpu argument was only needed on Windows.
As an example, I used a specific user agent:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36
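A detail worth double-checking when passing a user agent like this one: the user-agent= argument only picks up the value of a Python variable if the string is an f-string. Without the f prefix, Chrome receives the literal braces. A minimal plain-Python sketch of the difference (the helper name user_agent_argument is made up for illustration and is not part of Selenium):

```python
def user_agent_argument(user_agent: str) -> str:
    """Build the Chrome command-line argument for a given user-agent string."""
    return f"user-agent={user_agent}"


ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36")

# Without the f prefix the braces are NOT interpolated.
literal = 'user-agent={ua}'
# With the helper (an f-string) the real value is substituted.
formatted = user_agent_argument(ua)

print(literal)
print(formatted)
```

The resulting string is what would be handed to add_argument on the ChromeOptions object.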
Here is the result of my run:
Code block:
from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--window-size=1420,1080')
options.add_argument('--disable-gpu')
options.add_argument('user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36')
# Hide the most obvious automation hints
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get("https://www.macys.com/")
print(driver.page_source)
driver.quit()
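The point about --disable-gpu can be sketched as a small helper that adds the flag only when running on Windows, so the same script can be used on a Windows machine and on an Ubuntu EC2 instance. This is an illustrative sketch, not part of the run above; the function name chrome_arguments is hypothetical, and the argument list simply mirrors the flags already used in this thread.

```python
import platform


def chrome_arguments(system=None):
    """Return headless Chrome arguments, adding --disable-gpu only on Windows.

    `system` defaults to platform.system() and can be overridden for testing.
    """
    system = system or platform.system()
    args = ["--headless", "--no-sandbox", "--window-size=1420,1080"]
    if system == "Windows":
        args.append("--disable-gpu")
    return args


print(chrome_arguments("Linux"))
print(chrome_arguments("Windows"))
```

Each returned string could then be passed to options.add_argument(...) in a loop before the driver is created.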
Console output:
[1217/034634.234:INFO:CONSOLE(2022)] "Error: <svg> attribute viewBox: Expected number, "0 0 135px 40px".", source: https://www.macys.com/ (2022)
[1217/034635.403:INFO:CONSOLE(1)] "2309,2308", source: https://assets.macysassets.com/page/home-page/static/js/home-page.vendors~header.ca1a9a8ca3949327ad99.js (1)
[1217/034636.970:INFO:CONSOLE(0)] "A cookie associated with a cross-site resource at http://demdex.net/ was set without the `SameSite` attribute. A future release of Chrome will only deliver cookies with cross-site requests if they are set with `SameSite=None` and `Secure`. You can review cookies in developer tools under Application>Storage>Cookies and see more details at https://www.chromestatus.com/feature/5088147346030592 and https://www.chromestatus.com/feature/5633521622188032.", source: https://www.macys.com/ (0)
[1217/034638.024:INFO:CONSOLE(0)] "A cookie associated with a cross-site resource at http://everesttech.net/ was set without the `SameSite` attribute. A future release of Chrome will only deliver cookies with cross-site requests if they are set with `SameSite=None` and `Secure`. You can review cookies in developer tools under Application>Storage>Cookies and see more details at https://www.chromestatus.com/feature/5088147346030592 and https://www.chromestatus.com/feature/5633521622188032.", source: https://www.macys.com/ (0)
<html lang="en"><head class="at-element-marker">
<title>Macy's - Shop Fashion Clothing & Accessories - Official Site - Macys.com</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta http-equiv="Cache-Control" content="private, max-age=0, no-cache, must-revalidate">
<meta name="description" content="Macy's - FREE Shipping at Macys.com. Macy's has the latest fashion brands on Women's and Men's Clothing, Accessories, Jewelry, Beauty, Shoes and Home Products.">
<meta name="keywords" content="department store, dept store, department stores, Macys store, clothing, apparel, clothing store, accessories, macy's department store, macys department stores, macys apparel">
<meta property="og:title" content="Macy’s– Official Site">
<meta property="og:type" content="website">
<meta property="og:url" content="https://www.macys.com">
<meta property="og:description" content="FREE Shipping on the latest fashion brands on Women's and Men's Clothing, Accessories, Jewelry, Beauty, Shoes and Home Products.">
<meta property="og:image" content="https://www.macys.com/img/nav/co_macysLogo3.gif">
<meta property="og:site_name" content="Macy's">
<meta property="fb:app_id" content="172562576126509">
<meta name="google-site-verification" content="NXerNZgQYWmrno0UECIRSi5eHUACZ-5TThhQOA3SFvU">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="canonical" href="https://www.macys.com/">
<link rel="preconnect" href="https://assets.macysassets.com">
<link rel="preconnect" href="https://slimages.macysassets.com">
<link rel="preconnect" href="https://tags.tiqcdn.com">
<link rel="preconnect" href="https://libs.coremetrics.com">
<link rel="preconnect" href="https://dynamic.criteo.com">
<link rel="preconnect" href="https://rscdn.storetail.net">
<link rel="preload" as="style" href="https://assets.macysassets.com/page/home-page/static/css/carousel-ctrl.7608f93d9c891b06d342.css">
<link rel="preload" as="style" href="https://assets.macysassets.com/page/home-page/static/css/mcom.b097920404dcb4038b10.css">
<link rel="preload" as="style" href="https://assets.macysassets.com/page/home-page/static/css/vendors~canvas.c21198c7217ace6f58cd.css">
<link rel="preload" as="style" href="https://assets.macysassets.com/page/home-page/static/css/vendors~dynamic-slideshow-ctrl.6ad1fb956323ce6c391a.css">
<link rel="preload" as="style" href="https://assets.macysassets.com/page/home-page/static/css/vendors~footer.09f550f549f0b44659b0.css">
<link rel="preload" as="style" href="https://assets.macysassets.com/page/home-page/static/css/vendors~viewCompact~viewMinimalist~viewRadical.a3764e3d32ac27f9bd27.css">
<link rel="preload" as="style" href="https://assets.macysassets.com/page/home-page/static/css/vendors~viewCompact~viewRadical.425306252d7663c8b168.css">
<link rel="preload" as="style" href="https://assets.macysassets.com/page/home-page/static/css/vendors~viewFooterResponsive.f1ebc4caa32bc3e086d3.css">
<link rel="preload" as="style" href="https://assets.macysassets.com/page/home-page/static/css/viewCompact.35736d9d70390895a793.css">
<link rel="preload" as="style" href="https://assets.macysassets.com/page/home-page/static/css/viewRadical.58ec252b5ccc3cfcfc40.css">
<link rel="prefetch" as="style" href="https://assets.macysassets.com/page/home-page/static/css/BrowserVersionMessage.ddfbe993b80a3718f14a.css">
<link rel="prefetch" as="style" href="https://assets.macysassets.com/page/home-page/static/css/quickBag.e2ec561bf5f55e22791d.css">
<link rel="prefetch" as="style" href="https://assets.macysassets.com/page/home-page/static/css/vendors~prosFactory.0571235e1704cb0bfff1.css">
<link rel="prefetch" as="style" href="https://assets.macysassets.com/page/home-page/static/css/vendors~responsive-header.fb72e29bd4f59db62409.css">
<link rel="preload" as="script" href="https://assets.macysassets.com/page/home-page/static/js/home-page.vendor.common.13df968a79bd2962d068.js">
<link rel="preload" as="script" href="https://assets.macysassets.com/page/home-page/static/js/home-page.core.vendor.6baca319875c7fae46bc.js">
<link rel="preload" as="script" href="https://assets.macysassets.com/page/home-page/static/js/home-page.mcom.c5f407ff077934f61bc3.js">
<link rel="stylesheet" href="https://assets.macysassets.com/page/home-page/static/css/carousel-ctrl.7608f93d9c891b06d342.css">
<link rel="stylesheet" href="https://assets.macysassets.com/page/home-page/static/css/mcom.b097920404dcb4038b10.css">
<link rel="stylesheet" href="https://assets.macysassets.com/page/home-page/static/css/vendors~canvas.c21198c7217ace6f58cd.css">
<link rel="stylesheet" href="https://assets.macysassets.com/page/home-page/static/css/vendors~dynamic-slideshow-ctrl.6ad1fb956323ce6c391a.css">
<link rel="stylesheet" href="https://assets.macysassets.com/page/home-page/static/css/vendors~footer.09f550f549f0b44659b0.css">
<link rel="stylesheet" href="https://assets.macysassets.com/page/home-page/static/css/vendors~viewCompact~viewMinimalist~viewRadical.a3764e3d32ac27f9bd27.css">
<link rel="stylesheet" href="https://assets.macysassets.com/page/home-page/static/css/vendors~viewCompact~viewRadical.425306252d7663c8b168.css">
<link rel="stylesheet" href="https://assets.macysassets.com/page/home-page/static/css/vendors~viewFooterResponsive.f1ebc4caa32bc3e086d3.css">
<link rel="stylesheet" href="https://assets.macysassets.com/page/home-page/static/css/viewCompact.35736d9d70390895a793.css">
<link rel="stylesheet" href="https://assets.macysassets.com/page/home-page/static/css/viewRadical.58ec252b5ccc3cfcfc40.css">
.
.
.
<iframe sandbox="allow-scripts allow-same-origin" title="Adobe ID Syncing iFrame" id="destination_publishing_iframe_macyscominc_0" src="https://macyscominc.demdex.net/dest5.html?d_nsid=0#https%3A%2F%2Fwww.macys.com%2F" style="display: none; width: 0px; height: 0px;" class="aamIframeLoaded"></iframe>
<div class="redesign-header-overlay radical" style="display: none;"></div>
</body></html>
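To compare a run like the one above with the run on the EC2 machine, it may help to check the returned page source programmatically instead of reading it by eye. The sketch below is an assumption-laden illustration: it supposes the blocked response carries the words "Access Denied" in its <title>, which this thread does not confirm, and the names page_title and looks_denied as well as the denied_sample string are made up for the example.

```python
import re


def page_title(page_source: str) -> str:
    """Extract the text of the first <title> element, or '' if there is none."""
    match = re.search(r"<title[^>]*>(.*?)</title>", page_source,
                      re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else ""


def looks_denied(page_source: str, marker: str = "Access Denied") -> bool:
    """Heuristic: treat the page as blocked if the title contains the marker."""
    return marker.lower() in page_title(page_source).lower()


# Title taken from the console output above.
ok_sample = ("<html><head><title>Macy's - Shop Fashion Clothing & Accessories"
             " - Official Site - Macys.com</title></head></html>")
# Hypothetical blocked response, invented for this example.
denied_sample = "<html><head><title>Access Denied</title></head><body></body></html>"

print(looks_denied(ok_sample))
print(looks_denied(denied_sample))
```

In a real script the argument would be driver.page_source, and the marker would need to be adjusted to whatever the blocked page actually contains.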