If you use Selenium for automation, you may need the content of the whole page. Selenium returns it with a single line of code:

  • python
driver.page_source

  • java / groovy

driver.getPageSource();
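
As a minimal sketch of what you might do with that value, the helper below writes the page source to a file so it can be inspected later. The function name save_page_source and the default file name are assumptions made for illustration and are not part of Selenium; the only Selenium feature used is the page_source property shown above.

```python
from pathlib import Path


def save_page_source(driver, path="page.html"):
    """Write the current page's source to `path` and return its length in characters.

    `driver` is expected to be a Selenium WebDriver, but any object exposing
    a `page_source` string attribute will work.
    """
    html = driver.page_source  # source of the page currently loaded in the driver
    Path(path).write_text(html, encoding="utf-8")
    return len(html)
```

Call it after driver.get(...), for example save_page_source(driver, "google.html").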

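You can also get only the contents of the body element. Note that innerHTML returns the HTML markup inside the body rather than the visible text; the visible text is available through the element's text property (getText() in Java / Groovy). To get the markup: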

  • python
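from selenium.webdriver.common.by import By
element = driver.find_element(By.TAG_NAME, "body")  # find_element_by_tag_name("body") before Selenium 4.3
element.get_attribute('innerHTML')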
  • java / groovy
WebElement element = driver.findElement(By.tagName("body"));
element.getAttribute("innerHTML");

The code above works in most cases but may fail with some drivers (such as HtmlUnitDriver). The following alternative produces similar output and works more widely:

// "foo" is a placeholder id; use By.tagName("body") to read the whole body
WebElement element = driver.findElement(By.id("foo"));
String contents = (String)((JavascriptExecutor)driver).executeScript("return arguments[0].innerHTML;", element);
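
The same fallback can be sketched in Python. The helper below first tries the attribute lookup and only runs JavaScript if that raises or returns nothing. The name inner_html is an assumption made for illustration; get_attribute and execute_script are the regular Selenium Python calls. Which drivers actually need the fallback is not something this sketch can tell you, so treat it as a starting point.

```python
def inner_html(driver, element):
    """Return the inner HTML of `element`, preferring the attribute lookup.

    Falls back to executing JavaScript when the attribute lookup raises
    or returns None.
    """
    try:
        html = element.get_attribute("innerHTML")
    except Exception:  # broad on purpose: the failure mode is driver-specific
        html = None
    if html is None:
        # Same script as the Java snippet above, run through execute_script
        html = driver.execute_script("return arguments[0].innerHTML;", element)
    return html
```

Pass the driver and an element you have already located, for example the body element.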

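Full example for Python:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Selenium 4 style; Selenium 3 took the driver path as the first argument to webdriver.Chrome
driver = webdriver.Chrome(service=Service('./chromedriver_linux64/chromedriver'))
driver.maximize_window()
driver.get("https://www.google.com/ncr")
# .text returns the visible text of the element, not its HTML
print(driver.find_element(By.TAG_NAME, "body").text)
driver.quit()  # close the browser when done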

result:

Gmail
Images
Sign in
Google offered in: french
A privacy reminder from Google
REMIND ME LATER
REVIEW NOW
France
PrivacyTermsSettings
AdvertisingBusinessAbout

Note that if you don't provide the path to your chromedriver executable and it is not on your PATH, you may get an error like:

FileNotFoundError: [Errno 2] No such file or directory: 'chromedriver': 'chromedriver'

os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
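
One way to catch this earlier is to check the driver location yourself before starting the browser. The sketch below uses only the Python standard library. The function name resolve_chromedriver and its error messages are assumptions made for illustration; nothing here is Selenium API.

```python
import os
import shutil


def resolve_chromedriver(path=None):
    """Return a usable chromedriver path, or raise FileNotFoundError.

    If `path` is given it must point to an existing file; otherwise the
    executable named "chromedriver" is looked up on PATH.
    """
    if path is not None:
        if os.path.isfile(path):
            return os.path.abspath(path)
        raise FileNotFoundError(f"chromedriver not found at {path!r}")
    found = shutil.which("chromedriver")
    if found is None:
        raise FileNotFoundError("chromedriver is not on PATH; pass its location explicitly")
    return found
```

Pass the returned path wherever you supply the driver location when creating webdriver.Chrome.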