ESP32-S3: Crash into the HardFault_Handler #6791
Also, I’m running almost identical code on another 3.5” TFT FeatherWing, except on the Bluefruit Sense and ESP32 AirLift. That one works with no problem. Neradoc is also getting the same crash on his S3 TFT Feather. Someone else opened a similar issue 2 days later for the QT Py S2. Is this specific to Espressif S2/S3? |
Narrowed it down to the wifi or online portions of the code causing it. In its offline state, with no wifi code in it, it works perfectly fine; I can upload code with Mu and use the REPL without issue. Parts have been removed from the sketch. The offender is somewhere in the code below.
import time
import ssl
import wifi
import socketpool
import adafruit_requests
try:
    from secrets import secrets
except ImportError:
    print("Secrets File Import Error")
    raise
# OpenWeather 2.5 Free API
MAP_SOURCE = "https://tile.openweathermap.org/map"
MAP_SOURCE += "/precipitation_new"
MAP_SOURCE += "/9"
MAP_SOURCE += "/"+secrets['openweather_lat']
MAP_SOURCE += "/"+secrets['openweather_lon']
MAP_SOURCE += "/.png"
MAP_SOURCE += "&appid="+secrets['openweather_token']
DATA_SOURCE = "https://api.openweathermap.org/data/2.5/onecall?"
DATA_SOURCE += "lat="+secrets['openweather_lat']
DATA_SOURCE += "&lon="+secrets['openweather_lon']
DATA_SOURCE += "&timezone="+timezone
DATA_SOURCE += "&timezone_offset="+str(tz_offset_seconds)
DATA_SOURCE += "&exclude=hourly,daily"
DATA_SOURCE += "&appid="+secrets['openweather_token']
DATA_SOURCE += "&units=imperial"
print("\n===============================")
print("Connecting to WiFi...")
pool = socketpool.SocketPool(wifi.radio)
requests = adafruit_requests.Session(pool, ssl.create_default_context())
wifi.radio.connect(secrets["ssid"], secrets["password"])
print("Connected!\n")
try:
    print("Attempting to GET Weather!")
    # Uncomment line below to print API URL with all data filled in
    #print("Full API GET URL: ", DATA_SOURCE)
    print("\n===============================")
    response = requests.get(DATA_SOURCE).json()
    # uncomment the 2 lines below to see full json response
    # dump_object = json.dumps(response)
    # print("JSON Dump: ", dump_object)
    if int(response['current']['dt']) == "KeyError: example":
        print("Unable to retrive data due to key error")
        print("most likely OpenWeather Throttling for too many API calls per day")
    else:
        print("OpenWeather Success")
    get_timestamp = int(response['current']['dt'] + tz_offset_seconds)
    current_unix_time = time.localtime(get_timestamp)
    current_struct_time = time.struct_time(current_unix_time)
    current_date = "{}".format(_format_date(current_struct_time))
    current_time = "{}".format(_format_time(current_struct_time))
    sunrise = int(response['current']['sunrise'] + tz_offset_seconds)
    sunrise_unix_time = time.localtime(sunrise)
    sunrise_struct_time = time.struct_time(sunrise_unix_time)
    sunrise_time = "{}".format(_format_time(sunrise_struct_time))
    sunset = int(response['current']['sunset'] + tz_offset_seconds)
    sunset_unix_time = time.localtime(sunset)
    sunset_struct_time = time.struct_time(sunset_unix_time)
    sunset_time = "{}".format(_format_time(sunset_struct_time))
    owm_temp = response['current']['temp']
    owm_pressure = response['current']['pressure']
    owm_humidity = response['current']['humidity']
    weather_type = response['current']['weather'][0]['main']
    print("Timestamp:", current_date + " " + current_time)
    print("Sunrise:", sunrise_time)
    print("Sunset:", sunset_time)
    print("Temp:", owm_temp)
    print("Pressure:", owm_pressure)
    print("Humidity:", owm_humidity)
    print("Weather Type:", weather_type)
    print("\nNext Update in %s %s" % (int(sleep_int), sleep_time_conversion))
    print("===============================")
    gc.collect()
    date_label.text = current_date
    time_label.text = current_time
    owm_temp_data_shadow.text = "{:.1f}".format(owm_temp)
    owm_temp_data_label.text = "{:.1f}".format(owm_temp)
    owm_humidity_data_label.text = "{:.1f} %".format(owm_humidity)
    owm_barometric_data_label.text = "{:.1f}".format(owm_pressure)
except (ValueError, RuntimeError) as e:
    print("Failed to get data, retrying\n", e)
    wifi.reset()
    time.sleep(60)
    continue
response = None
I believe it has to do with socketpool, ssl, or requests. Attempting to narrow it down more. |
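One detail in the excerpt above is worth flagging: `int(response['current']['dt']) == "KeyError: example"` compares an integer to a string, so that branch can never run. A throttled or rejected request would instead raise `KeyError` on `response['current']`, which the `except (ValueError, RuntimeError)` clause does not catch. Below is a minimal sketch of an explicit check. The helper name `extract_current` is hypothetical, and the `cod`/`message` fields are an assumption about what OpenWeather returns on errors, not something shown in this thread.

```python
def extract_current(response):
    """Return the 'current' block of a parsed One Call response, or None.

    `response` is the dict produced by `requests.get(...).json()`.
    """
    current = response.get("current")
    if current is None or "dt" not in current:
        # Assumed error shape: {"cod": 429, "message": "..."}.
        # Fall back to generic text when those fields are absent.
        print(
            "No weather data:",
            response.get("cod", "?"),
            response.get("message", "unknown error"),
        )
        return None
    return current


# A successful response and a (hypothetical) throttled one.
ok = extract_current({"current": {"dt": 1663200000, "temp": 81.5}})
throttled = extract_current({"cod": 429, "message": "limit exceeded"})
```

With a check like this, the main loop can skip the display update and retry later when `None` comes back, instead of relying on an exception that is never caught.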
Ugh, this is like hitting a moving target. Backed out of everything, returned to the original code.py, and everything works fine. This thing is time based: I have to wait about 2 minutes, put Mu in the background, then come back to it in order to replicate. Wait 2 minutes, hit save in Mu, and nothing. No response in serial. Mu says it's saving code.py, and I confirmed it does, but the board will not actually auto-reload.
It sits there stuck, with no response upon saving. It never reloads. Reset the board, and now I'm stuck in a different kind of safe mode crash loop, one that isn't sending output to the REPL. Reset the board again. Waited 2 minutes, saved, and the USB disconnected and reconnected twice. Then I once again saw the safe mode crash message.
|
Slimmed it down to this, saved, came back 10 minutes later, hit save again, works perfectly fine.
# SPDX-FileCopyrightText: 2022 DJDevon3 for Adafruit Industries
# SPDX-License-Identifier: MIT
"""DJDevon3 UM ESP32 Feather S3 Online Weatherstation"""
import board
import displayio
from adafruit_display_text import label
from adafruit_bitmap_font import bitmap_font
from adafruit_bme280 import basic as adafruit_bme280
from adafruit_hx8357 import HX8357
# 3.5" TFT Featherwing is 480x320
displayio.release_displays()
DISPLAY_WIDTH = 480
DISPLAY_HEIGHT = 320
# Initialize TFT Display
spi = board.SPI()
tft_cs = board.IO1
tft_dc = board.IO3
display_bus = displayio.FourWire(spi,
                                 command=tft_dc,
                                 chip_select=tft_cs,
                                 baudrate=50000000,
                                 phase=0,
                                 polarity=0)
display = HX8357(display_bus, width=DISPLAY_WIDTH, height=DISPLAY_HEIGHT)
# Initialize BME280 Sensor
i2c = board.I2C() # uses board.SCL and board.SDA
bme280 = adafruit_bme280.Adafruit_BME280_I2C(i2c)
# Quick Colors for Labels
text_black = 0x000000
text_blue = 0x0000FF
text_cyan = 0x00FFFF
text_gray = 0x8B8B8B
text_green = 0x00FF00
text_lightblue = 0x90C7FF
text_magenta = 0xFF00FF
text_orange = 0xFFA500
text_purple = 0x800080
text_red = 0xFF0000
text_white = 0xFFFFFF
text_yellow = 0xFFFF00
# Fonts are optional
medium_font = bitmap_font.load_font("/fonts/Arial-16.bdf")
huge_font = bitmap_font.load_font("/fonts/GoodTimesRg-Regular-121.bdf")
temp_label = label.Label(medium_font)
temp_label.anchor_point = (1.0, 1.0)
temp_label.anchored_position = (475, 145)
temp_label.scale = 2
temp_label.color = text_white
temp_data_label = label.Label(huge_font)
temp_data_label.anchor_point = (0.5, 1.0)
temp_data_label.anchored_position = (DISPLAY_WIDTH / 2, 200)
temp_data_label.scale = 1
temp_data_label.color = text_orange
temp_data_shadow = label.Label(huge_font)
temp_data_shadow.anchor_point = (0.5, 1.0)
temp_data_shadow.anchored_position = (DISPLAY_WIDTH / 2 + 2, 200 + 2)
temp_data_shadow.scale = 1
temp_data_shadow.color = text_black
humidity_label = label.Label(medium_font)
humidity_label.anchor_point = (0.0, 1.0)
humidity_label.anchored_position = (5, DISPLAY_HEIGHT - 23)
humidity_label.scale = 1
humidity_label.color = text_gray
humidity_data_label = label.Label(medium_font)
humidity_data_label.anchor_point = (0.0, 1.0)
humidity_data_label.anchored_position = (5, DISPLAY_HEIGHT)
humidity_data_label.scale = 1
humidity_data_label.color = text_orange
barometric_label = label.Label(medium_font)
barometric_label.anchor_point = (1.0, 1.0)
barometric_label.anchored_position = (470, DISPLAY_HEIGHT - 27)
barometric_label.scale = 1
barometric_label.color = text_gray
barometric_data_label = label.Label(medium_font)
barometric_data_label.anchor_point = (1.0, 1.0)
barometric_data_label.anchored_position = (470, DISPLAY_HEIGHT)
barometric_data_label.scale = 1
barometric_data_label.color = text_orange
# Load Bitmap to tile grid first (Background layer)
DiskBMP = displayio.OnDiskBitmap("/images/Astral_Fruit_8bit.bmp")
tile_grid = displayio.TileGrid(
    DiskBMP,
    pixel_shader=DiskBMP.pixel_shader,
    width=1,
    height=1,
    tile_width=DISPLAY_WIDTH,
    tile_height=DISPLAY_HEIGHT)
# Create subgroups
text_group = displayio.Group()
text_group.append(tile_grid)
temp_group = displayio.Group()
main_group = displayio.Group()
# Add subgroups to main display group
main_group.append(text_group)
main_group.append(temp_group)
# Label Display Group (foreground layer)
temp_group.append(temp_label)
temp_group.append(temp_data_shadow)
temp_group.append(temp_data_label)
text_group.append(humidity_label)
text_group.append(humidity_data_label)
text_group.append(barometric_label)
text_group.append(barometric_data_label)
display.show(main_group)
while True:
    temp_label.text = "°F"
    temperature = "{:.1f}".format(bme280.temperature * 1.8 + 32)
    temp_data_shadow.text = temperature
    temp_data_label.text = temperature
    humidity_label.text = "Humidity"
    humidity_data_label.text = "{:.1f} %".format(bme280.relative_humidity)
    barometric_label.text = "Pressure"
    barometric_data_label.text = "{:.1f}".format(bme280.pressure)
Used Ctrl+D and Ctrl+C a couple of times, which normally fails. Hit save a couple of times, etc. Everything works fine. Now it seems code related. In case you spotted the double use of Waited another 10 minutes, saved again. Everything fine. Waited another hour, still fine. When saving while it's working, the USB doesn't reload or trigger that Windows USB popup. It just works. |
Ran some tests on Adafruit Feather ESP32-S2 with the same code. Issue is not present on the S2. |
Updated to the nightly build, hoping it would sort some things out after UM pushed a PR for the LDO pin and with the other bug hunting going on recently. Still getting the same issue. Still crashing with a hard fault to safe mode. A couple of minutes after a hard reset, Mu doesn't seem to be able to save to the board. It hangs during the save process and waits to reload indefinitely. Ctrl+C and Ctrl+D are unresponsive. All kinds of issues; the board is practically unusable in this state.
Adafruit CircuitPython 8.0.0-beta.0-47-gae64f9fd7 on 2022-09-14; FeatherS3 with ESP32S3
Code stopped by auto-reload. Reloading soon. |
Oops, was using 7.3.3 libraries. Added in 8.x libraries and it's working better. It refuses to reconnect to wifi after a soft reload from Mu; it'll never connect. Hit reset and keep Mu closed and it will connect to wifi every time. I can only use Mu for serial monitoring after a hard reset. As soon as you hit save and it auto-reloads, it'll never connect again. Haven't hit safe mode and a hard fault with the new 8.x 9/14 bundle libraries yet. It's still getting stuck on auto-reload though. :( I must keep hitting hard reset, and this board is deep in my flock box enclosure. I need to mount a reset button on my enclosure. |
Based on your testing this afternoon, should we close this? |
I'm hesitant to close this just yet, only did about 5 mins worth of tests which isn't enough time for this bug especially knowing others are having very similar hard faults. It'll be great if all the hard faults end up being USB hub or controller issues. I'd like to spend at least a couple days with it just in case. Also have some more tests to do on the rp2040, m4 express, and S2 as well. |
Issue persists. Still crashing to safe mode and hard faulting. Windows continually reports an error with the drive; there's some kind of issue with the way Windows is handling this particular drive compared with all of my other boards. It's also the only board that's crashing with hard faults. |
Do you have another S3 board to test? You might just try reformatting the filesystem too, with |
I have a lot of different boards I can test. RP2040, M0 Wifi, M4 Express, Bluefruit Sense, Pi Pico, ESP32-S2, QT Py S3. |
I do actually have another S3 but it's the same exact UM Feather S3. Bought 2. Plus the QT Py S3 but unsure if that counts? Oh I've run it through the nuking process twice to no avail. |
Upon plugging in the 2nd UM FeatherS3, Windows immediately popped up the same message about there being a problem with the drive. Windows does not like this board. No errors in Device Manager. Running the default code.py that comes with the UM FeatherS3:
import time, gc, os
import neopixel
import board, digitalio
import feathers3
# Create a NeoPixel instance
# Brightness of 0.3 is ample for the 1515 sized LED
pixel = neopixel.NeoPixel(board.NEOPIXEL, 1, brightness=0.3, auto_write=True, pixel_order=neopixel.RGB)
# Say hello
print("\nHello from FeatherS3!")
print("------------------\n")
# Show available memory
print("Memory Info - gc.mem_free()")
print("---------------------------")
print("{} Bytes\n".format(gc.mem_free()))
flash = os.statvfs('/')
flash_size = flash[0] * flash[2]
flash_free = flash[0] * flash[3]
# Show flash size
print("Flash - os.statvfs('/')")
print("---------------------------")
print("Size: {} Bytes\nFree: {} Bytes\n".format(flash_size, flash_free))
print("Pixel Time!\n")
# Create a colour wheel index int
color_index = 0
# Turn on the power to the NeoPixel
feathers3.set_ldo2_power(True)
# Rainbow colours on the NeoPixel
while True:
    # Get the R,G,B values of the next colour
    r,g,b = feathers3.rgb_color_wheel( color_index )
    # Set the colour on the NeoPixel
    pixel[0] = ( r, g, b, 0.5)
    # Increase the wheel index
    color_index += 1
    # If the index == 255, loop it
    if color_index == 255:
        color_index = 0
        # Invert the internal LED state every half colour cycle
        feathers3.led_blink()
    # Sleep for 15ms so the colour cycle isn't too fast
    time.sleep(0.015)
Code works fine, but it's not using wifi. While the first hard faulting Feather is plugged in, I can use the 2nd Feather with no problem. The 2nd Feather seems fine. It does take much longer than normal to soft-reload, like 5-10 seconds. It's slow to boot compared to most other boards I own; it does, however, have a massive amount of flash and RAM compared to all my other boards. Pasted over the same script from the first Feather and it ran, throwing some errors for missing libs because I didn't have it plugged into the TFT FeatherWing. Plugged it into the TFT FeatherWing, rebooted, and it immediately crashed into the hard fault handler. Removed it from the FeatherWing. Removed all display related stuff from the script. Reset, ran again, still crashing to the hard fault handler. Saw that I hadn't unplugged the crashing Feather, so I transferred images/icons/font/libs to the 2nd Feather, then unplugged the crashing Feather. With the same scripts and everything duplicated, the 2nd UM FeatherS3 works fine. Can soft reload with no problem. REPL is responsive, nothing out of the ordinary. Wait 5 minutes, hit save in Mu again on the 2nd S3, and it crashes to the HardFault_Handler. There is a problem with UM FeatherS3s. Will try harder to narrow down the problem. |
You might check with @UnexpectedMaker at their site about this. |
This code does not crash. Unused imports compared to crashing script are: pretty sure I can remove all the display related stuff and it'll be fine.
# SPDX-FileCopyrightText: 2022 DJDevon3 for Adafruit Industries
# SPDX-License-Identifier: MIT
"""DJDevon3 UM ESP32 Feather S3 Online Weatherstation"""
import gc
import supervisor
import time
import board
import feathers3
import displayio
import digitalio
import adafruit_sdcard
import storage
import wifi
import socketpool
import adafruit_requests
from adafruit_hx8357 import HX8357
# USB Power Sensing
usb_sense = supervisor.runtime.serial_connected
# 3.5" TFT Featherwing is 480x320
displayio.release_displays()
DISPLAY_WIDTH = 480
DISPLAY_HEIGHT = 320
# Initialize WiFi Pool (This should always be near the top of a script!)
# anecdata: you only want to do this once early in your code pool.
# Highlander voice: "There can be only one pool"
pool = socketpool.SocketPool(wifi.radio)
# Time between weather updates
# 900 = 15 mins, 1800 = 30 mins, 3600 = 1 hour
sleep_time = 900
# Initialize TFT Display
spi = board.SPI()
tft_cs = board.IO1
tft_dc = board.IO3
display_bus = displayio.FourWire(spi,
                                 command=tft_dc,
                                 chip_select=tft_cs)
display = HX8357(display_bus, width=DISPLAY_WIDTH, height=DISPLAY_HEIGHT)
# Initialize SDCard on TFT Featherwing
cs = digitalio.DigitalInOut(board.IO33)
sdcard = adafruit_sdcard.SDCard(spi, cs)
vfs = storage.VfsFat(sdcard)
virtual_root = "/sd"
storage.mount(vfs, virtual_root)
Bat_S3 = feathers3.get_battery_voltage()
# print(Bat_S3)
try:
    from secrets import secrets
except ImportError:
    print("Secrets File Import Error")
    raise
if sleep_time < 60:
    sleep_time_conversion = "seconds"
    sleep_int = sleep_time
elif 60 <= sleep_time < 3600:
    sleep_int = sleep_time / 60
    sleep_time_conversion = "minutes"
elif 3600 <= sleep_time < 86400:
    sleep_int = sleep_time / 60 / 60
    sleep_time_conversion = "hours"
else:
    sleep_int = sleep_time / 60 / 60 / 24
    sleep_time_conversion = "days"
def _format_datetime(datetime):
    return "{:02}/{:02}/{} {:02}:{:02}:{:02}".format(
        datetime.tm_mon,
        datetime.tm_mday,
        datetime.tm_year,
        datetime.tm_hour,
        datetime.tm_min,
        datetime.tm_sec,
    )
def _format_date(datetime):
    return "{:02}/{:02}/{:02}".format(
        datetime.tm_year,
        datetime.tm_mon,
        datetime.tm_mday,
    )
def _format_time(datetime):
    return "{:02}:{:02}".format(
        datetime.tm_hour,
        datetime.tm_min,
        # datetime.tm_sec,
    )
while True:
    gc.collect()
    # Changes battery voltage color depending on charge level
    print("USB Sense: ", usb_sense)
    if usb_sense:
        print("USB connected")
    if not usb_sense:
        print("USB not connected")
    time.sleep(sleep_time) |
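The interval-to-label branching in the script above (seconds, minutes, hours, days) could be factored into a small helper so each variant of the sketch does not have to repeat it. This is only a sketch: the function name `humanize_interval` is hypothetical, and it returns the same pair of values the script stores in `sleep_int` and `sleep_time_conversion`.

```python
def humanize_interval(seconds):
    """Return (value, unit) for a sleep interval given in seconds."""
    if seconds < 60:
        return seconds, "seconds"
    if seconds < 3600:
        return seconds / 60, "minutes"
    if seconds < 86400:
        return seconds / 3600, "hours"
    return seconds / 86400, "days"


# Same 15 minute interval the weatherstation uses.
sleep_int, sleep_time_conversion = humanize_interval(900)
print("Next Update in %s %s" % (int(sleep_int), sleep_time_conversion))
```

The early returns replace the chained range checks (`60 <= sleep_time < 3600`) because each earlier branch has already excluded the smaller values.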
Tried narrowing it down more by removing all display related stuff.
# SPDX-FileCopyrightText: 2022 DJDevon3 for Adafruit Industries
# SPDX-License-Identifier: MIT
"""DJDevon3 UM ESP32 Feather S3 Online Weatherstation"""
import gc
import supervisor
import time
import feathers3
import ssl
import wifi
import socketpool
import adafruit_requests
# USB Power Sensing
usb_sense = supervisor.runtime.serial_connected
# Initialize WiFi Pool (This should always be near the top of a script!)
# anecdata: you only want to do this once early in your code pool.
# Highlander voice: "There can be only one pool"
pool = socketpool.SocketPool(wifi.radio)
# Time between weather updates
# 900 = 15 mins, 1800 = 30 mins, 3600 = 1 hour
sleep_time = 900
Bat_S3 = feathers3.get_battery_voltage()
# print(Bat_S3)
try:
    from secrets import secrets
except ImportError:
    print("Secrets File Import Error")
    raise
if sleep_time < 60:
    sleep_time_conversion = "seconds"
    sleep_int = sleep_time
elif 60 <= sleep_time < 3600:
    sleep_int = sleep_time / 60
    sleep_time_conversion = "minutes"
elif 3600 <= sleep_time < 86400:
    sleep_int = sleep_time / 60 / 60
    sleep_time_conversion = "hours"
else:
    sleep_int = sleep_time / 60 / 60 / 24
    sleep_time_conversion = "days"
# Fill OpenWeather 2.5 API with token data
# OpenWeather free account & token are required
timezone = secrets['timezone']
tz_offset_seconds = int(secrets['timezone_offset'])
# OpenWeather 2.5 Free API
MAP_SOURCE = "https://tile.openweathermap.org/map"
MAP_SOURCE += "/precipitation_new"
MAP_SOURCE += "/9"
MAP_SOURCE += "/" + secrets['openweather_lat']
MAP_SOURCE += "/" + secrets['openweather_lon']
MAP_SOURCE += "/.png"
MAP_SOURCE += "&appid=" + secrets['openweather_token']
DATA_SOURCE = "https://api.openweathermap.org/data/2.5/onecall?"
DATA_SOURCE += "lat=" + secrets['openweather_lat']
DATA_SOURCE += "&lon=" + secrets['openweather_lon']
DATA_SOURCE += "&timezone=" + timezone
DATA_SOURCE += "&timezone_offset=" + str(tz_offset_seconds)
DATA_SOURCE += "&exclude=hourly,daily"
DATA_SOURCE += "&appid=" + secrets['openweather_token']
DATA_SOURCE += "&units=imperial"
def _format_datetime(datetime):
    return "{:02}/{:02}/{} {:02}:{:02}:{:02}".format(
        datetime.tm_mon,
        datetime.tm_mday,
        datetime.tm_year,
        datetime.tm_hour,
        datetime.tm_min,
        datetime.tm_sec,
    )
def _format_date(datetime):
    return "{:02}/{:02}/{:02}".format(
        datetime.tm_year,
        datetime.tm_mon,
        datetime.tm_mday,
    )
def _format_time(datetime):
    return "{:02}:{:02}".format(
        datetime.tm_hour,
        datetime.tm_min,
        # datetime.tm_sec,
    )
# Connect to Wi-Fi
print("\n===============================")
print("Connecting to WiFi...")
requests = adafruit_requests.Session(pool, ssl.create_default_context())
while not wifi.radio.ipv4_address:
    try:
        wifi.radio.enabled = False
        wifi.radio.enabled = True
        wifi.radio.connect(secrets['ssid'], secrets['password'])
    except ConnectionError as e:
        print("Connection Error:", e)
        print("Retrying in 10 seconds")
    time.sleep(10)
    gc.collect()
print("Connected!\n")
while True:
    gc.collect()
    # Changes battery voltage color depending on charge level
    print("USB Sense: ", usb_sense)
    if usb_sense:
        print("USB connected")
    if not usb_sense:
        print("USB disconnected")
    try:
        print("Attempting to GET Weather!")
        # Uncomment line below to print API URL with all data filled in
        # print("Full API GET URL: ", DATA_SOURCE)
        print("\n===============================")
        response = requests.get(DATA_SOURCE).json()
        # uncomment the 2 lines below to see full json response
        # dump_object = json.dumps(response)
        # print("JSON Dump: ", dump_object)
        if int(response['current']['dt']) == "KeyError: example":
            print("Unable to retrive data due to key error")
            print("most likely OpenWeather Throttling for too many API calls per day")
        else:
            print("OpenWeather Success")
        get_timestamp = int(response['current']['dt'] + tz_offset_seconds)
        current_unix_time = time.localtime(get_timestamp)
        current_struct_time = time.struct_time(current_unix_time)
        current_date = "{}".format(_format_date(current_struct_time))
        current_time = "{}".format(_format_time(current_struct_time))
        sunrise = int(response['current']['sunrise'] + tz_offset_seconds)
        sunrise_unix_time = time.localtime(sunrise)
        sunrise_struct_time = time.struct_time(sunrise_unix_time)
        sunrise_time = "{}".format(_format_time(sunrise_struct_time))
        sunset = int(response['current']['sunset'] + tz_offset_seconds)
        sunset_unix_time = time.localtime(sunset)
        sunset_struct_time = time.struct_time(sunset_unix_time)
        sunset_time = "{}".format(_format_time(sunset_struct_time))
        owm_temp = response['current']['temp']
        owm_pressure = response['current']['pressure']
        owm_humidity = response['current']['humidity']
        weather_type = response['current']['weather'][0]['main']
        print("Timestamp:", current_date + " " + current_time)
        print("Sunrise:", sunrise_time)
        print("Sunset:", sunset_time)
        print("Temp:", owm_temp)
        print("Pressure:", owm_pressure)
        print("Humidity:", owm_humidity)
        print("Weather Type:", weather_type)
        print("\nNext Update in %s %s" % (int(sleep_int), sleep_time_conversion))
        print("===============================")
        gc.collect()
    except (ValueError, RuntimeError) as e:
        print("Failed to get data, retrying\n", e)
        time.sleep(60)
        continue
    response = None
    time.sleep(sleep_time)
This produces the most interesting result yet. An error I've never seen before. Refuses to connect to wifi, maybe I missed something?
===============================
Connecting to WiFi...
Connection Error: Unknown failure 205
Retrying in 10 seconds
Connection Error: Unknown failure 205
Retrying in 10 seconds
Connection Error: Unknown failure 205
Retrying in 10 seconds
Connection Error: Unknown failure 205
Retrying in 10 seconds
Connection Error: Unknown failure 205
Retrying in 10 seconds
Connection Error: Unknown failure 205
Retrying in 10 seconds
Connection Error: Unknown failure 205
Retrying in 10 seconds
The error handler for adafruit_requests is catching a 205... what's a 205? |
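On the question of what a 205 is: in ESP-IDF's `wifi_err_reason_t` enum, the codes from 200 upward are Wi-Fi disconnect reasons, and 205 is `WIFI_REASON_CONNECTION_FAIL`, a generic failure that does not by itself say why the connection failed. The sketch below annotates the message with that name. It assumes the number in CircuitPython's `Unknown failure N` text is the ESP-IDF reason code passed through unchanged; the helper name `describe_failure` is hypothetical.

```python
# Disconnect reason codes in the 200 range, as named in ESP-IDF's
# wifi_err_reason_t enum (WIFI_REASON_ prefix dropped for brevity).
WIFI_REASONS = {
    200: "BEACON_TIMEOUT",
    201: "NO_AP_FOUND",
    202: "AUTH_FAIL",
    203: "ASSOC_FAIL",
    204: "HANDSHAKE_TIMEOUT",
    205: "CONNECTION_FAIL",
}


def describe_failure(message):
    """Append the reason name to a message that ends in a known code."""
    parts = message.split()
    code = parts[-1] if parts else ""
    if code.isdigit() and int(code) in WIFI_REASONS:
        return "{} ({})".format(message, WIFI_REASONS[int(code)])
    # Leave messages without a recognised trailing code untouched.
    return message


print(describe_failure("Unknown failure 205"))
```

In the connect loop this could wrap the exception text, for example `print("Connection Error:", describe_failure(str(e)))`.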
Narrowed down the 205 as much as possible.
# SPDX-FileCopyrightText: 2022 DJDevon3 for Adafruit Industries
# SPDX-License-Identifier: MIT
# UM ESP32 FeatherS3
import wifi
import socketpool
# Initialize WiFi Pool
pool = socketpool.SocketPool(wifi.radio)
try:
    from secrets import secrets
    timezone = secrets['timezone']
    tz_offset_seconds = int(secrets['timezone_offset'])
except ImportError:
    print("Secrets File Import Error")
    raise
# Connect to Wi-Fi
print("\n===============================")
print("Connecting to WiFi...")
while not wifi.radio.ipv4_address:
    try:
        wifi.radio.connect(secrets['ssid'], secrets['password'])
    except ConnectionError as e:
        print("Connection Error:", e)
        print("Retrying in 10 seconds")
# With the 205 Error it never reaches below this point
print("Connected!\n")
Oh duh, because it's being caught by the wifi error handler, the 205 error is internal to the wifi library. Well, at least I was able to narrow that one down. Not crashing into the HardFault_Handler though. I give up.
===============================
Connecting to WiFi...
Connection Error: Unknown failure 205
Retrying in 10 seconds
Connection Error: Unknown failure 205 |
I'm kinda confused by this issue - is this about hard faults or wifi issues? |
Re the Wifi - your last code example - does that work on the QTPy S3? |
|
@UnexpectedMaker It's really about both wifi and the HardFault_Handler, since I've been unable to set off the hard fault handler without wifi in the mix. I'm sure wifi is related and triggering the issue somehow. Yes, the small sketch works on the QT Py S3.
Adafruit CircuitPython 8.0.0-alpha.1 on 2022-06-09; Adafruit QT Py ESP32-S3 no psram with ESP32S3
# secrets.py
secrets = {
    "ssid": "Your SSID",
    "password": "Your Wifi Password",
}
# code.py (minimal sketch for wifi and hard fault handler test)
import wifi
import socketpool
# Initialize WiFi Pool
pool = socketpool.SocketPool(wifi.radio)
try:
    from secrets import secrets
except ImportError:
    print("Secrets File Import Error")
    raise
# Connect to Wi-Fi
print("\n===============================")
print("Connecting to WiFi...")
while not wifi.radio.ipv4_address:
    try:
        wifi.radio.connect(secrets['ssid'], secrets['password'])
    except ConnectionError as e:
        print("Connection Error:", e)
# With the 205 Error it never reaches below this point
print("Connected!\n")
Connected on the first attempt; consecutive reloads also connected.
Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
code.py output:
===============================
Connecting to WiFi...
Connected!
Code done running.
Same code running on the UM FeatherS3:
===============================
Connecting to WiFi...
Connection Error: Unknown failure 205
Connection Error: Unknown failure 205
Connection Error: Unknown failure 205
Connection Error: Unknown failure 205
Connection Error: Unknown failure 205
Connection Error: Unknown failure 205
Takes about 15-30 seconds per attempt. |
This is the 2nd UM FeatherS3, with nothing attached: no TFT, sensor, or battery. It exhibits the same symptoms and behavior as the 1st FeatherS3 in this scenario. It also hard faults with wifi and has trouble connecting with the smaller test sketch (205 error). Yeah, I really need to clean up the flux. I have the luxury of inspecting it in my hand from different angles; the joints are good. First FeatherS3. Had it in this configuration for months. I love my little weatherstation. Just went through a hurricane with it btw. |
Interesting that it worked for months for you. I know it's not your choice, but you have the high speed signal wires of the screen sitting at the antenna, and that's going to cause grief. Plus it's inside a case, reducing the antenna gain even more, and you have an I2C device hanging off the back next to the antenna. It's a worst case setup there; I'm surprised that ever worked for you at all ;) That said, if it was working and now it's not... I'm still perplexed as to what the reported problem is about. Wifi was working fine and now isn't? If the HW hasn't changed through all of that, and there have only been SW and firmware level changes, I'm not sure where I'm supposed to be giving input on this? |
It looks like your AP is just failing to assign the FS3 an IP address. |
Oh, I said it was in that configuration since the day I got it. I didn't say it ever worked like the rest of my boards. I used to have an Adafruit S2 in there and that worked. Put your FeatherS3 in there and it's been non-stop reaching in to hit the reset button. I can plug it in, run it once, and that works if I don't ever touch it again. That's how I have a pretty picture of it actually working. If I try to work with the code in Mu, or touch the USB drive to open a file on it, it crashes to the hard fault handler, or wifi only works after a hard reset (two different problems), at least in my main weatherstation sketch. It's like reaching into a bag of angry cats trying to get it to work right: it only works right once after a hard reset, then crashes into the hard fault handler if I do anything else. It's NOT doing that with the tiny test sketch I just came up with in here, but it's also not connecting to wifi now, which is the correlation. I'm sure if I can get it to connect to wifi it'll crash into the HardFault_Handler again. So yeah, that's the main issue. Good suggestions, will look into them asap. Rebooted the router and it did eventually connect.
Connection Error: Unknown failure 205
Connection Error: Unknown failure 205
Connected!
Now to wait and see if I can get it to hard fault, as I suspect it's the CP internal wifi library triggering it somehow. Usually it takes about 5 minutes of not touching it and then it will hard fault. Windows reports an error with the drive on both FS3s. I think the waiting is triggering some kind of USB timeout/reload in Windows. The drive disappears and reconnects, I have to bring up the REPL again, and it'll show a crash into the hard fault handler... that's usually what happens. Still waiting for it to happen with the minimal test sketch. |
You wrote "First FeatherS3. Had it in this configuration for months. I love my little weatherstation. Just went through a hurricane with it btw." Now you say it never worked properly? I'm so confused and really have no idea where to go to from here. |
When code exits, wifi deinits and the connection is dropped. |
Found a possible correlation with an old bug, #3712, where Todbot reported the exact symptoms I have. Maybe the bug came back? Possibly the reason I'm having this issue and most aren't is that my APs also have DHCP disabled. My router handles DHCP and VLANs.
AP --> 24 port switch --> firewall --> router
That's the path for any wireless request. Maybe a race condition, or deauth not happening fast enough? |
My APs don't have DHCP either. Depending on which AP a device connects to, its path is: device --> one of several APs --> [some paths via one or more of several managed switches] --> router with DHCP. |
Until 6866 is resolved, I'd suggest running CP8 as follows:
wifi.radio.enabled = False
wifi.radio.enabled = True
(edit: #2 isn't a reproducible fix for connecting after reload) |
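The radio toggle suggested above can be combined with a bounded retry loop so a board that cannot connect stops trying instead of spinning forever. The sketch below takes the radio as a parameter, so the same function runs on a board (pass `wifi.radio`) or on a desktop with a stand-in object. The function name `connect_with_retry` and its parameters are hypothetical, and, as the edit above notes, the toggle is not a reproducible fix for connecting after reload.

```python
import time


def connect_with_retry(radio, ssid, password, attempts=5, delay=10, sleep=time.sleep):
    """Try to join `ssid`, power-cycling the radio before each attempt.

    `radio` is expected to behave like CircuitPython's `wifi.radio`:
    an `enabled` flag, a `connect()` method, and `ipv4_address`.
    Returns True once an address is assigned, False after `attempts` tries.
    """
    for attempt in range(1, attempts + 1):
        # Toggle the radio off and on, as suggested in the thread.
        radio.enabled = False
        radio.enabled = True
        try:
            radio.connect(ssid, password)
        except ConnectionError as e:
            print("Connection Error:", e, "(attempt", attempt, "of", str(attempts) + ")")
        if radio.ipv4_address:
            return True
        sleep(delay)
    return False
```

On the board the call would look like `connect_with_retry(wifi.radio, secrets['ssid'], secrets['password'])`, with the caller deciding what to do when it returns `False`.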
I’ll have to add some exceptions to my firewall which is a pain to get at and in disarray after moving into a new place. Intend on running Ethernet this winter when my attic isn’t 120F death trap. Might take me a while to get at the firewall stuff. I can’t run web workflow wired and especially not wirelessly because port 80 is taken. Melissa’s guide says it doesn’t like running on anything but port 80 right now. That’s a problem for me and most lan admins even if it’s only on a loopback. I’d have to reorganize a good portion of my lan just to test web workflow. That’s beside the point of not being keen on poking holes in my firewall for IOT devices to access my maker pc and the internet. While I love the idea of web workflow you have to admit it’s going to change the landscape of iot vlans, reconfiguration is required. I wish it was as simple as just throwing creds into a env file but it’s really not. Summary of this issue to date:
Hard fault bug investigation has to be put on hold until the hard reset bug is fixed. Too many variables and getting at it behind the hard reset bug not worth the insanity it’s causing. |
I infer from the guide that it's just the online Code Editor that hasn't been tested on other ports. The local file browser and code editor should be fine. But you don't even need to use any of those tools to get the connection and scanning benefit. Simply having web workflow enabled seems to improve scanning and connection behavior. Web workflow could be enabled on an alternate port, like 8080, in the Addendum: I just changed half dozen of mine to port 8088, and went through reset then several reload cycles on each, and they don't seem to be exhibiting the abbreviated scans or failure to connect issues. |
The Oct 11th, 2022 nightly build fixed the hard-reset bug on the UM FeatherS3 as well. Pat on the back to DanH & anecdata, and anyone else involved in tracking it down. If this hard-fault bug (different than hard-reset bug) does still exist it should get exposed pretty quick in the next couple days. Because I'm using the UM FeatherS3 as my main board in my main long term project if it does hard fault I'll notice it immediately on my weatherstation display. I'd like a couple days to put it through paces, should be plenty of time. If all goes well I will gleefully close. |
Woke up this morning to find my weatherstation stopped at 11:37pm last night. The modified timestamp on code.py says 10:06pm, which was my last save before going to bed for the night. It automatically sends a GET request to openweathermaps every 15 minutes. It only had a chance to run successfully about 3 times before crashing. The HardFault bug, as I suspected, persisted through the fix for the hard-reset bug. It's pretty easy to tell approximately when it crashed, because the timestamp is that of the last successful weather update (every 15 minutes).
🐍Wi-Fi: off | Done | 8.0.0-beta.1-29-gfed884738
Auto-reload is off.
Running in safe mode! Not running saved code.
You are in safe mode because:
CircuitPython core code crashed hard. Whoops!
Crash into the HardFault_Handler.
Please file an issue with the contents of your CIRCUITPY drive at
https://github.com/adafruit/circuitpython/issues
Press any key to enter the REPL. Use CTRL-D to reload.
🐍Wi-Fi: off | Done | 8.0.0-beta.1-29-gfed884738
CTRL+C and CTRL+D are unresponsive. A hard reset is required to get it out of safe mode. Now running:
Adafruit CircuitPython 8.0.0-beta.1-29-gfed884738 on 2022-10-11; FeatherS3 with ESP32S3
Board ID:unexpectedmaker_feathers3
This version has definitely fixed the hard reset bug, thankfully, but the HardFault bug persists. I can probably get it to HardFault faster if I change the update interval from 15 minutes to 2 minutes. OpenWeather does have an API limit, which I believe is about 1.5 minutes per call, so there won't be any super fast iterations possible when replicating my exact error. |
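For anyone trying to reproduce this with a shorter update interval, here is a minimal sketch of a non-blocking interval check built on `time.monotonic()`. It is not the code from the weather station project; `IntervalTimer`, `due`, and the injectable `now` clock are hypothetical names introduced for illustration, and the 180 second figure in the usage note is just one value that stays above the roughly 1.5 minute limit mentioned above.

```python
import time


class IntervalTimer:
    """Report when at least `interval_seconds` have passed since the last hit."""

    def __init__(self, interval_seconds, now=time.monotonic):
        self.interval = interval_seconds
        self._now = now      # injectable clock, handy for testing off-device
        self._last = None    # None means "never fired", so the first check is due

    def due(self):
        """Return True (and restart the interval) if an update should run now."""
        current = self._now()
        if self._last is None or current - self._last >= self.interval:
            self._last = current
            return True
        return False
```

In a main loop this would look like `timer = IntervalTimer(180)` followed by `if timer.due(): ...fetch the weather data...`, leaving the rest of the loop free to refresh the display between requests.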
RedBeard in discord suggested a possible correlation between I2C and HardFaults in multiple bug reports across the board. Unplugged I2C BME280, removed code, and set weather update interval to every 3 minutes beginning at 2:02pm. It's now 2:23pm and no hard faults yet. If it is a memory leak of some kind it might take a while. |
Ran for 2 hours without issue. Tried removing some code to set up an I2C-only example and it HardFaulted on hitting save in Mu. It's got something to do with USB and file save after a certain period of time. |
Ported some of this project from the UM Feather S3 + TFT featherwing to an Adafruit Feather S3 4mb/2mb + RGBmatrix panel + RGB matrix featherwing. It's also crashing into the hard fault handler. Reliably. Only experimented with it for a couple of hours last night. Initial impression is that the timing-related bug is still present on 8.0.0 beta 4. The fault handler crashes to safe mode, which requires a hard reset. It only crashes when wifi is added to the code; I was trying to pull API data from an online source (openweathermaps). Using wifi and displayio in both scripts/scenarios on different hardware. The only common denominator I'm aware of between the 2 different sets of hardware is the S3 chip. Still looking into it. Just an update from a beta testing perspective. Another correlation is that after a hard reset it will run no problem, and then trying to make a change and save in Mu triggers the hard fault. |
Can you summarize the situations where it crashes?
|
Let's close this and open a new issue that's more precise if this comes up again. It's really hard to follow what's going on here. |
I haven't revisited my weather station project because of that bug, it completely derailed my project. The bug was in 7.x and continued into 8.x beta. We're now past the 8.x stable release date. It's been permanently off for months. Got sidetracked with other projects. I'll make it a point to load this one back up tonight and test with the current stable version of 8.0.5. I am still hesitant to close this issue as it was a doozy of a bug. Will check and report back. |
I switched to the Adafruit ESP32-S3 which was exhibiting the same symptoms at the time during 7.x and 8.0 beta. Gave up on it at that point.
Adafruit CircuitPython 8.0.5 on 2023-03-31; Adafruit Feather ESP32S3 4MB Flash 2MB PSRAM with ESP32S3
Board ID:adafruit_feather_esp32s3_4mbflash_2mbpsram
Been running like a dream for an hour. Everything is much much faster too. I don't want to swap the UM FeatherS3 back in because it's deep inside an enclosure and would take about 20 minutes just to swap the boards. Can't say for certain but definitely feels like the hard fault bug with the S3's has been resolved judging from how the Adafruit ESP32-S3 is functioning. I'll take another shot at running a short demo on the UM FeatherS3 later. The weather station TFT is sitting on my desk right in front of my main monitor 24/7. If it crops up again I'll definitely know it and will resubmit. Closing |
I did eventually get back to testing the UM FeatherS3 running 24 hours without a single error let alone a hard fault. This issue is now 100% confirmed closed on 8.0.5 stable release. |
CircuitPython version
Code/REPL
Behavior
Auto-reload is off.
Running in safe mode! Not running saved code.
You are in safe mode because:
CircuitPython core code crashed hard. Whoops!
Crash into the HardFault_Handler.
Please file an issue with the contents of your CIRCUITPY drive at
https://github.com/adafruit/circuitpython/issues
Press any key to enter the REPL. Use CTRL-D to reload.
Adafruit CircuitPython 7.3.2 on 2022-07-20; FeatherS3 with ESP32S3
Description
Issue documented in discord dev channel. https://discord.com/channels/327254708534116352/327298996332658690/1010913898708336681
Additional information
The issue is worse on 8.0 beta than on 7.3: it still crashes to safe mode, but without any error describing it in the REPL like the one 7.3 shows. The issue occurs in both UF2 versions.
After a certain period of time, usually about 2-3 minutes, when I hit save in Mu the USB drive disconnects and reconnects; I pull up the REPL immediately and it shows that message. I have to physically hit reset on the board, get to save my code once or twice before the 2-3 minutes are up, and then the process repeats itself. It’s a safe mode crashing loop.
Here's the full contents of my CIRCUITPY drive, including /lib and everything. https://github.com/DJDevon3/CircuitPython/tree/main/UM_ESP32-S3%20Online-Offline%20Weatherstation