Automating (and Debugging) File Downloads with PowerShell

If you’ve ever used PowerShell you’ll know how great it is for automating all those things you really don’t want to do more than once. Recently I needed to grab a file from a web page so we didn’t have folks downloading it and copying it around manually. It’s a super simple use case which makes life easier for everyone.

So let’s get a script pulled together to do just that. This page happens to have multiple documents on it covering different periods of data, and I’m just checking for whenever the latest one gets posted. The script gets the page, finds the top link, checks if we’ve already seen that one, and if not it’ll download it.

# Set up some parameters to use
$DownloadLocation = "D:\Storage\3rdParty\Inbound\"
$PageWithLinks = "https://www.someone.org/PageWithLinks.asp"

# Get the contents of the page, find the link to the file and then split the file name from it
$PageContents = Invoke-WebRequest -Uri $PageWithLinks
$FirstExcelLink = $PageContents.Links | Where-Object { $_.href -like "*.xlsx" } | Select-Object -First 1
$FileName = Split-Path $FirstExcelLink.href -Leaf

# Check if we already have the file and if not then download it
if (!(Test-Path -Path ($DownloadLocation + $FileName))) {
   Invoke-WebRequest -Uri $FirstExcelLink.href -OutFile ($DownloadLocation + $FileName)
}

It’s great, works a treat. PowerShell is awesome 😀

The Catch

Yeah, it works a treat. On my machine (isn’t it always the way!)

So the time comes to put this into our automation process and guess what? Nothing. We’re getting no errors thrown up and our inbound folder is empty. Hmm, that’s interesting: if I run the script as myself on the server it’s fine. I try adding my credentials into the script as a test, in case it’s proxy related for example. Nope, that doesn’t help things. The automation runs this task in the background using a service account and it’s not retaining any logs to look into, so I need another way to see what’s going on in there!

Searching for a solution I find this great post by Bill Kindle about PowerShell transcripting. You can manually start a transcript within your processes to build up logs, which I’ve used previously where there’s a need for something to persist, but it’s a rarity (because we’re automating jobs so we don’t need to check them, right?). In this case though we know the script works, so I want to know why it’s not working in this specific implementation. What caught my eye was enabling transcripting at the server level, which will automatically output logs for almost any PowerShell being run.
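For anyone wanting to try the server-level option, here’s a rough sketch of what it can look like. It assumes the policy values behind the "Turn on PowerShell Transcription" group policy setting, and the log folder is a made-up path for illustration, so treat it as a starting point rather than the exact keys I set.

```powershell
# Sketch only: run from an elevated prompt on the server.
# Assumption: these are the policy values used by the
# "Turn on PowerShell Transcription" group policy setting.
$PolicyKey = "HKLM:\SOFTWARE\Policies\Microsoft\Windows\PowerShell\Transcription"

# Hypothetical folder for the transcripts; the service account must be able to write here
$TranscriptFolder = "D:\Logs\PowerShellTranscripts"

# Create the policy key and the output folder if they don't exist yet
if (!(Test-Path -Path $PolicyKey)) {
    New-Item -Path $PolicyKey -Force | Out-Null
}
if (!(Test-Path -Path $TranscriptFolder)) {
    New-Item -Path $TranscriptFolder -ItemType Directory | Out-Null
}

# Switch transcription on, add a header for each command, and point the logs at our folder
New-ItemProperty -Path $PolicyKey -Name "EnableTranscripting" -Value 1 -PropertyType DWord -Force | Out-Null
New-ItemProperty -Path $PolicyKey -Name "EnableInvocationHeader" -Value 1 -PropertyType DWord -Force | Out-Null
New-ItemProperty -Path $PolicyKey -Name "OutputDirectory" -Value $TranscriptFolder -PropertyType String -Force | Out-Null

# When you're done debugging, removing the key turns transcription back off:
# Remove-Item -Path $PolicyKey -Recurse
```

The manual alternative is to wrap a single script in `Start-Transcript -Path <some log file>` and `Stop-Transcript`, which is handy when you only want logs for one job rather than for everything on the box.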

A few registry keys later and we’re set up and ready to go. Firing up a prompt for a test command, Get-Date, and ping, we’ve got a log! Open ISE, which starts caching IntelliSense, and ping, another log! Fantastic! Firing off the automation again, we get another log file which unearths the culprit:

TerminatingError(Invoke-WebRequest): "The response content cannot be parsed because the Internet Explorer engine is not available, or Internet Explorer's first-launch configuration is not complete. Specify the UseBasicParsing parameter and try again."

Of course, it was Internet Explorer. Exactly what I’d needed to see, and a really easy fix. A quick update of the script to add the -UseBasicParsing switch to Invoke-WebRequest was all it needed for everything to start working like a dream. Transcripts turned off, automation in place, job done, one more thing to look after itself.
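For reference, here’s roughly what the script looks like with the switch added to both Invoke-WebRequest calls. The check for a missing link and the use of Join-Path are small extras I’ve assumed for this sketch rather than part of the original fix.

```powershell
# Same parameters as the original script
$DownloadLocation = "D:\Storage\3rdParty\Inbound\"
$PageWithLinks = "https://www.someone.org/PageWithLinks.asp"

# -UseBasicParsing skips the Internet Explorer engine, so the request no longer
# depends on IE's first-launch configuration having been completed for the account
$PageContents = Invoke-WebRequest -Uri $PageWithLinks -UseBasicParsing
$FirstExcelLink = $PageContents.Links | Where-Object { $_.href -like "*.xlsx" } | Select-Object -First 1

# Extra for this sketch: stop with a clear message if the page has no .xlsx link
if (-not $FirstExcelLink) {
    throw "No .xlsx link found on $PageWithLinks"
}

# Split the file name from the link and build the full destination path
$FileName = Split-Path $FirstExcelLink.href -Leaf
$Destination = Join-Path -Path $DownloadLocation -ChildPath $FileName

# Check if we already have the file and if not then download it
if (!(Test-Path -Path $Destination)) {
    Invoke-WebRequest -Uri $FirstExcelLink.href -OutFile $Destination -UseBasicParsing
}
```

The Links collection is still populated with basic parsing, so the rest of the logic carries over unchanged. On PowerShell 6 and later, basic parsing is the only mode and the switch is accepted but has no effect, so leaving it in does no harm if the script moves to a newer version.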
