Random Reboots

  • This topic has 9 replies, 1 voice, and was last updated 1 month ago by mu-b.
  • #38528
    mu-b
    Participant

    All – I’ve noticed what I suspect is a power-related reboot issue on my espressobin v7/1gb. The device is running OpenWRT 18.06.2 and has been for some time. However, it reboots at random intervals with nothing printed to the console to indicate a kernel fault. I’ve tried numerous versions of u-boot, but to no avail. In fact the problem seems a lot like https://forum.netgate.com/topic/144636/sg-1100-intermittent-reboots (which I believe is also espressobin based) and as such might be a board problem relating to power.

    Anyone else noticed this or found a stable config?

    (Currently running u-boot 2018.03-devel-18.12.3-gc9aa92c-armbian 1gb-1cs-800-1000, openwrt 18.06.2).

    #38529
    tekrantz
    Participant

    No solution, but my espressobin v7 1gb has exactly the same behavior. My other two boards do not: one is from the gofundme and the other is a v5 purchased from Amazon. My v7 has shut itself down 3 times today.

    #38530
    mu-b
    Participant

    Yes, it would appear that these failures are not kernel related: there is no kernel panic at all, nor any message, just a hard reset of the hardware.

    The following is an example output from an espressobin v7 running at 800/800 with the latest u-boot (which randomly fails to boot past ‘SVC REV: 5, CPU VDD voltage: 1.050V’). I’m tempted to build u-boot myself and adjust the VDD marginally, using a kernel with cpu governing/powersave disabled.

    These crashes tend to come in blocks: the device will be stable for a few hours and then commence rebooting again. It does not appear to be cpu or memory related, as I’ve cross compiled memtester for openwrt and ran it successfully for hours prior to a hard reset.

    TIM-1.0
    WTMI-devel-18.12.1-e6bb176
    WTMI: system early-init
    SVC REV: 5, CPU VDD voltage: 1.050V
    NOTICE: Booting Trusted Firmware
    NOTICE: BL1: v1.5(release):1f8ca7e (Marvell-devel-18.12.2)
    NOTICE: BL1: Built : 16:26:08, May 21 2019
    NOTICE: BL1: Booting BL2
    NOTICE: BL2: v1.5(release):1f8ca7e (Marvell-devel-18.12.2)
    NOTICE: BL2: Built : 16:26:10, May 21 2019
    NOTICE: BL1: Booting BL31
    NOTICE: BL31: v1.5(release):1f8ca7e (Marvell-devel-18.12.2)
    NOTICE: BL31: Built : 16:26:13

    U-Boot 2018.03-devel-18.12.3-gc9aa92c-armbian (Feb 20 2019 – 09:45:04 +0100)

    Model: Marvell Armada 3720 Community Board ESPRESSOBin
    CPU 800 [MHz]
    L2 800 [MHz]
    TClock 200 [MHz]
    DDR 800 [MHz]
    DRAM: 1 GiB
    Comphy chip #0:
    Comphy-0: USB3 5 Gbps
    Comphy-1: PEX0 2.5 Gbps
    Comphy-2: SATA0 6 Gbps
    SATA link 0 timeout.
    AHCI 0001.0300 32 slots 1 ports 6 Gbps 0x1 impl SATA mode
    flags: ncq led only pmp fbss pio slum part sxs
    PCIE-0: Link down
    MMC: sdhci@d0000: 0, sdhci@d8000: 1
    Loading Environment from SPI Flash… SF: Detected mx25u3235f with page size 256 Bytes, erase size 64 KiB, total 4 MiB
    OK
    Model: Marvell Armada 3720 Community Board ESPRESSOBin
    Net: eth0: neta@30000 [PRIME]
    Hit any key to stop autoboot: 2 TIM-1.0
    WTMI-devel-18.12.1-e6bb176
    WTMI: system early-init
    SVC REV: 5, CPU VDD voltage: 1.050V
    NOTICE: Booting Trusted Firmware
    NOTICE: BL1: v1.5(release):1f8ca7e (Marvell-devel-18.12.2)
    NOTICE: BL1: Built : 16:26:08, May 21 2019
    NOTICE: BL1: Booting BL2
    NOTICE: BL2: v1.5(release):1f8ca7e (Marvell-devel-18.12.2)
    NOTICE: BL2: Built : 16:26:10, May 21 2019
    NOTICE: BL1: Booting BL31
    NOTICE: BL31: v1.5(release):1f8ca7e (Marvell-devel-18.12.2)
    NOTICE: BL31: Built : 16:26:13

    U-Boot 2018.03-devel-18.12.3-gc9aa92c-armbian (Feb 20 2019 – 09:45:04 +0100)

    Model: Marvell Armada 3720 Community Board ESPRESSOBin
    CPU 800 [MHz]
    L2 800 [MHz]
    TClock 200 [MHz]
    DDR 800 [MHz]
    DRAM: 1 GiB
    Comphy chip #0:
    Comphy-0: USB3 5 Gbps
    Comphy-1: PEX0 2.5 Gbps
    Comphy-2: SATA0 6 Gbps
    SATA link 0 timeout.
    AHCI 0001.0300 32 slots 1 ports 6 Gbps 0x1 impl SATA mode
    flags: ncq led only pmp fbss pio slum part sxs
    PCIE-0: Link down
    MMC: sdhci@d0000: 0, sdhci@d8000: 1
    Loading Environment from SPI Flash… SF: Detected mx25u3235f with page size 256 Bytes, erase size 64 KiB, total 4 MiB
    OK
    Model: Marvell Armada 3720 Community Board ESPRESSOBin
    Net: eth0: neta@30000 [PRIME]
    Hit any key to stop autoboot: 0
    switch to partitions #0, OK
    mmc0 is current device
    7391240 bytes read in 325 ms (21.7 MiB/s)
    7885 bytes read in 9 ms (855.5 KiB/s)
    ## Flattened Device Tree blob at 06f00000
    Booting using the fdt blob at 0x6f00000
    Using Device Tree in place at 0000000006f00000, end 0000000006f04ecc

    Starting kernel …

    [ 0.000000] Booting Linux on physical CPU 0x0
    [ 0.000000] Linux version 4.14.95 (buildbot@builds-03.infra.lede-project.org) (gcc version 7.3.0 (OpenWrt GCC 7.3.0 r7627-753531d)) #0 SMP Mon Jan 28 08:54:32 2019
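For anyone capturing the serial console to a file, here is a minimal Python sketch for spotting resets like the one in the log above, where the `TIM-1.0` banner appears on the same line as the autoboot countdown. It assumes that `TIM-1.0` is always the first thing printed on boot, which matches the logs in this thread but is not guaranteed for other firmware builds. The function name `find_boots` is made up for illustration.

```python
# "TIM-1.0" is the first banner line in the logs above, so each occurrence
# is treated as the start of a new boot (an assumption, see the note above).
BOOT_BANNER = "TIM-1.0"


def find_boots(log_text):
    """Return (line_number, mid_line) for every boot banner in the log.

    mid_line is True when the banner follows other output on the same line,
    which suggests the board reset while it was still printing.
    """
    boots = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        index = line.find(BOOT_BANNER)
        if index == -1:
            continue
        mid_line = line[:index].strip() != ""
        boots.append((number, mid_line))
    return boots


# Four lines taken from the console output above.
sample = "\n".join([
    "TIM-1.0",
    "WTMI: system early-init",
    "Hit any key to stop autoboot: 2 TIM-1.0",
    "WTMI: system early-init",
])

for number, mid_line in find_boots(sample):
    note = "reset mid-output" if mid_line else "start of line"
    print(f"boot banner at line {number}: {note}")
```

A banner flagged as "reset mid-output" only shows that the board restarted while printing. It says nothing about the cause.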

    #38531
    tekrantz
    Participant

    Here is my boot sequence. I use a custom built kernel and a u-boot I got from armbian. I use the same kernel and the same u-boot on my v5 and same kernel with similar u-boot on my gofundme 2GB board.

    TIM-1.0
    WTMI-devel-18.12.1-e6bb176
    WTMI: system early-init
    SVC REV: 4, CPU VDD voltage: 1.120V
    NOTICE: Booting Trusted Firmware
    NOTICE: BL1: v1.5(release):1f8ca7e (Marvell-devel-18.12.2)
    NOTICE: BL1: Built : 16:26:25, May 21 2019
    NOTICE: BL1: Booting BL2
    NOTICE: BL2: v1.5(release):1f8ca7e (Marvell-devel-18.12.2)
    NOTICE: BL2: Built : 16:26:26, May 21 2019
    NOTICE: BL1: Booting BL31
    NOTICE: BL31: v1.5(release):1f8ca7e (Marvell-devel-18.12.2)
    NOTICE: BL31: Built : 16:2

    U-Boot 2018.03-devel-18.12.3-gc9aa92c-armbian (Feb 20 2019 – 09:45:04 +0100)

    Model: Marvell Armada 3720 Community Board ESPRESSOBin
    CPU 1000 [MHz]
    L2 800 [MHz]
    TClock 200 [MHz]
    DDR 800 [MHz]
    DRAM: 1 GiB
    Comphy chip #0:
    Comphy-0: USB3 5 Gbps
    Comphy-1: PEX0 2.5 Gbps
    Comphy-2: SATA0 6 Gbps
    Target spinup took 0 ms.
    AHCI 0001.0300 32 slots 1 ports 6 Gbps 0x1 impl SATA mode
    flags: ncq led only pmp fbss pio slum part sxs
    PCIE-0: Link down
    MMC: sdhci@d0000: 0, sdhci@d8000: 1
    Loading Environment from SPI Flash… SF: Detected mx25u3235f with page size 256 Bytes, erase size 64 KiB, total 4 MiB
    OK
    Model: Marvell Armada 3720 Community Board ESPRESSOBin
    Net: eth0: neta@30000 [PRIME]
    Hit any key to stop autoboot: 0
    switch to partitions #0, OK
    mmc0 is current device
    24535552 bytes read in 21316 ms (1.1 MiB/s)
    10590 bytes read in 8 ms (1.3 MiB/s)
    ## Flattened Device Tree blob at 06000000
    Booting using the fdt blob at 0x6000000
    Using Device Tree in place at 0000000006000000, end 000000000600595d

    Starting kernel …

    [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
    [ 0.000000] Linux version 5.3.0 (root@eb.foo.com) (gcc version 9.2.1 20190827 (Red Hat 9.2.1-1) (GCC)) #1 SMP Mon Sep 16 09:23:43 EDT 2019
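The two logs in this thread report different `SVC REV` and `CPU VDD voltage` values (5 / 1.050V versus 4 / 1.120V). The Python sketch below pulls those values out of a captured log so several boards can be compared side by side. It assumes the line format is exactly as printed above. The helper name `read_svc` is hypothetical, and the sketch only reads the values without judging whether either voltage is correct.

```python
import re

# Matches lines such as "SVC REV: 5, CPU VDD voltage: 1.050V".
SVC_LINE = re.compile(r"SVC REV:\s*(\d+),\s*CPU VDD voltage:\s*([\d.]+)V")


def read_svc(log_text):
    """Return a list of (svc_rev, vdd_volts) tuples found in the log."""
    return [(int(rev), float(vdd)) for rev, vdd in SVC_LINE.findall(log_text)]


# The two lines as they appear in the logs posted in this thread.
log_from_mu_b = "SVC REV: 5, CPU VDD voltage: 1.050V"
log_from_tekrantz = "SVC REV: 4, CPU VDD voltage: 1.120V"

for label, text in (("mu-b", log_from_mu_b), ("tekrantz", log_from_tekrantz)):
    for rev, vdd in read_svc(text):
        print(f"{label}: SVC rev {rev}, CPU VDD {vdd:.3f} V")
```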

    #38532
    mu-b
    Participant

    I’m currently building my own TIM/u-boot with debugging, etc, turned on in verbose mode.

    I’m currently working from the following hypotheses:
    1 – this is some kind of heat related issue
    2 – this is some kind of dram training issue
    3 – the AVS VDD voltage is slightly too low

    Reasons for the above:

    1 – the crashes seem to be far more numerous when the device is under heavy load and hot. I’m currently running tests on this with a custom TIM/u-boot build (WTMI-devel-18.12.0-a0a1cb8/U-Boot 2018.03-devel-18.12.3-gc9aa92ce70 (Oct 02 2019 – 08:26:57 +0100)).

    2 – the crashes look like a dram timing issue. I make this guess because a build of WTMI-devel-18.12.1-e6bb176 randomly fails to get past dram training and indeed crashes during it. The changes between WTMI-devel-18.12.1-e6bb176 and WTMI-devel-18.12.0-a0a1cb8 relate directly to moving dram training to the byte level.

    3 – perhaps the AVS VDD voltage is slightly too low; I’m going to increase this to the next VDD voltage and give that a try.

    I’ll label these builds and make them available if anyone wants them!

    #38534
    mu-b
    Participant

    Well I’ve recompiled many a firmware with various modifications and am still no closer to absolute stability. The diagnosis in https://forum.netgate.com/topic/144636/sg-1100-intermittent-reboots does seem right for a certain number of these boards, though, and I suspect that my device is one of those affected since the reboots seem to come in blocks.
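To check whether reboots really do come in blocks, one option is to log the time of each reset and group resets that fall close together. This is a rough Python sketch of that idea. The 30 minute gap is an arbitrary assumption, the function name `group_into_blocks` is invented, and the timestamps are made up for illustration and are not measurements from this thread.

```python
from datetime import datetime, timedelta


def group_into_blocks(timestamps, max_gap=timedelta(minutes=30)):
    """Group reset times into blocks separated by more than max_gap."""
    blocks = []
    for moment in sorted(timestamps):
        if blocks and moment - blocks[-1][-1] <= max_gap:
            # Close enough to the previous reset: same block.
            blocks[-1].append(moment)
        else:
            # Too long since the previous reset: start a new block.
            blocks.append([moment])
    return blocks


# Made-up reset times for illustration only.
resets = [
    datetime(2019, 10, 1, 9, 0),
    datetime(2019, 10, 1, 9, 5),
    datetime(2019, 10, 1, 9, 20),
    datetime(2019, 10, 1, 14, 0),
    datetime(2019, 10, 1, 14, 10),
]

for block in group_into_blocks(resets):
    print(f"{len(block)} resets between {block[0]:%H:%M} and {block[-1]:%H:%M}")
```

Lining the blocks up against temperature or load readings would then show whether they coincide with the board being hot.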

    #38536
    mu-b
    Participant

    Well I’ve fixed my instability issue. The root cause was the same as https://forum.netgate.com/topic/144636/sg-1100-intermittent-reboots, and it was power related. The fix was to replace the 470 µF/16 V capacitor (component EC1 on the schematic) sitting immediately behind the 12V DC jack, which under testing leaked under certain temperature conditions (variations). The board is now stable and has been running OpenWRT 18.06.2 for 24 hours with memtester and stress running continuously during that entire time.

    I’ll leave it a few more days; after that I’m going to assume that it’s fixed and that this is what NetGate were referring to when they mentioned ‘a power related component’.

    #38597
    provod
    Participant

    @mu-b, again, thank you for your effort! You’ve mentioned in this thread https://forum.armbian.com/topic/10429-how-to-make-espressobin-v7-stable/page/2/ that replacing the capacitor did not ultimately fix the issue, right?

    Have there been any new developments since?

    I’m about to return the board, because it essentially became unusable.

    #38611
    barish
    Participant

    I wonder what percentage of V7 boards are failing/unstable… Reselling boards built into a decent quality product is not an option at all if there’s even a 1% probability of an unstable board… Unfortunately, communication from Globalscale is not abundant. In forums like this one, armbian.com and the aforementioned netgate.com, there is never an official statement from the manufacturer.

    #38617
    mu-b
    Participant

    @provod yes, it rebooted a few more times. However, for the last month the device has been up and running solid; uptime is currently 31 days.

    I’m going to pull the console logs from the machine and grab the DDR settings. I rebuilt WTMI to print the DDR training output verbosely; next I will recompile TIM/WTMI with those DDR training values hardcoded, since really the only thing that varies between reboots of the device is the DDR training values. That should put the device back into its current state on every reboot.
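One way to confirm that only the DDR training values change between boots is to diff two captured boot logs after stripping the kernel timestamps. This is a minimal Python sketch using the standard `difflib` module. The `ddr_window=` lines in the sample are an invented placeholder, since the real verbose WTMI output format is not shown in this thread, and the helper names `normalise` and `boot_diff` are hypothetical.

```python
import difflib
import re

# Kernel timestamps such as "[ 0.000000]" change on every boot, so drop them
# before comparing.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def normalise(log_text):
    """Strip surrounding whitespace and kernel timestamps from each line."""
    return [TIMESTAMP.sub("", line.strip()) for line in log_text.splitlines()]


def boot_diff(first_log, second_log):
    """Return only the lines that differ between two boot logs."""
    diff = list(difflib.unified_diff(
        normalise(first_log), normalise(second_log), lineterm="", n=0
    ))
    # The first two entries are the "---"/"+++" file headers; skip them and
    # keep only removed ("-") and added ("+") lines.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Placeholder logs: the ddr_window lines are invented for illustration.
boot_a = "TIM-1.0\nddr_window=0x1a\n[ 0.000000] Booting Linux"
boot_b = "TIM-1.0\nddr_window=0x1c\n[ 0.000123] Booting Linux"

for line in boot_diff(boot_a, boot_b):
    print(line)
```

If the output contains anything other than DDR training lines, something else is varying between boots as well.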
