heartbeat is stopped for some reason

Anyway, hnode2 was the active node and its services are still running fine, but heartbeat on it has somehow stopped.

Here is the last log I see of heartbeat:

[quote:23c84415f5]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 9/1762471 ms age 0 [pid16738/MST_CONTROL]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 716/51784021 152624/74519 [pid16738/MST_CONTROL]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 200276 total malloc bytes. pid [16738/MST_CONTROL]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/14 ms age 405180540 [pid16741/HBFIFO]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 321/581 30772/13815 [pid16741/HBFIFO]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 32600 total malloc bytes. pid [16741/HBFIFO]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373810 [pid16742/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 340/657021 33264/15511 [pid16742/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 42008 total malloc bytes. pid [16742/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373820 [pid16743/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 340/394 25136/11458 [pid16743/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 25220 total malloc bytes. pid [16743/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373820 [pid16744/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 352/657052 34784/16543 [pid16744/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 43528 total malloc bytes. pid [16744/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373820 [pid16745/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 353/1244439 34868/16587 [pid16745/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 35812 total malloc bytes. pid [16745/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373820 [pid16746/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 364/657082 36304/17575 [pid16746/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 44840 total malloc bytes. pid [16746/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373830 [pid16747/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 364/454 28176/13522 [pid16747/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 36472 total malloc bytes. pid [16747/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373850 [pid16748/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 376/657112 37824/18607 [pid16748/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 46360 total malloc bytes. pid [16748/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373850 [pid16749/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 376/484 29696/14554 [pid16749/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 37992 total malloc bytes. pid [16749/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/1140417 ms age 40 [pid16750/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 388/30411871 39344/19639 [pid16750/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 51588 total malloc bytes. pid [16750/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/518348 ms age 30 [pid16751/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 389/10885817 39428/19683 [pid16751/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 40928 total malloc bytes. pid [16751/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: These are nothing to worry about.
[/quote:23c84415f5]


Now when I start it again I get these messages (I've seen them before). What alternative do I have? The services are running on hnode2, and I can't just unmount and stop OpenVZ in a production environment.

I also never stopped heartbeat myself. It was evidently fine until early this evening (5:12 PM) and stopped sometime between then and now (11:48 PM).
[quote:23c84415f5]Starting High-Availability services:
2008/09/09_23:45:21 CRITICAL: Resource drbddisk::r0 is active, and should not be!
2008/09/09_23:45:21 CRITICAL: Non-idle resources can affect data integrity!
2008/09/09_23:45:21 info: If you don't know what this means, then get help!
2008/09/09_23:45:21 info: Read the docs and/or source to /usr/share/heartbeat/ResourceManager for more details.
CRITICAL: Resource drbddisk::r0 is active, and should not be!
CRITICAL: Non-idle resources can affect data integrity!
info: If you don't know what this means, then get help!
info: Read the docs and/or the source to /usr/share/heartbeat/ResourceManager for more details.
2008/09/09_23:45:21 CRITICAL: Non-idle resources will affect resource takeback!
2008/09/09_23:45:21 CRITICAL: Non-idle resources may affect data integrity!
[ OK ]
[/quote:23c84415f5]


 

hnode1 complains after I restart heartbeat on hnode2:

[quote:d10093f29f]Sep 9 23:44:50 hnode1 heartbeat: [31055]: info: Heartbeat restart on node hnode2.ca
Sep 9 23:44:50 hnode1 heartbeat: [31055]: info: Status update for node hnode2.ca: status init
Sep 9 23:44:50 hnode1 heartbeat: [31055]: info: Status update for node hnode2.ca: status up
Sep 9 23:44:51 hnode1 heartbeat: [31055]: info: Status update for node hnode2.ca: status active
Sep 9 23:44:51 hnode1 heartbeat: [31055]: ERROR: should_drop_message: attempted replay attack [hnode2.ca]? [gen = 1220406280, curgen = 1220406281]
Sep 9 23:44:51 hnode1 heartbeat: [31055]: info: remote resource transition completed.
Sep 9 23:44:51 hnode1 heartbeat: [31055]: ERROR: No one owns our local resources!
Sep 9 23:44:51 hnode1 heartbeat: [31055]: ERROR: No one owns our local resources!
Sep 9 23:44:51 hnode1 heartbeat: [31055]: ERROR: should_drop_message: attempted replay attack [hnode2.ca]? [gen = 1220406280, curgen = 1220406281]
Sep 9 23:44:55 hnode1 heartbeat: [31055]: ERROR: should_drop_message: attempted replay attack [hnode2.ca]? [gen = 1220406280, curgen = 1220406281][/quote:d10093f29f]
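
To make the "replay attack" lines easier to read, here is a minimal sketch that pulls the two generation numbers out of such a line and compares them. The function names are my own, and the reading that a message is dropped when its gen is older than the curgen the receiver expects is an assumption based only on the wording of the log, not on the heartbeat source.

```python
import re

# Pattern for the gen/curgen pair as it appears in the hnode1 log above.
GEN_RE = re.compile(r"\[gen = (\d+), curgen = (\d+)\]")

def parse_generations(line):
    """Return (gen, curgen) as ints, or None if the line has no such pair."""
    m = GEN_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

def is_stale(line):
    """True if the message generation is lower than the generation the receiver expects.

    Assumption: this is the condition behind the should_drop_message error.
    """
    gens = parse_generations(line)
    return gens is not None and gens[0] < gens[1]

sample = (
    "Sep 9 23:44:51 hnode1 heartbeat: [31055]: ERROR: should_drop_message: "
    "attempted replay attack [hnode2.ca]? [gen = 1220406280, curgen = 1220406281]"
)
print(parse_generations(sample))
print(is_stale(sample))
```

In the lines above the message generation is exactly one behind the current one, which would fit hnode1 still receiving packets from before the restart, but that is only a guess.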


 

On hnode2, shortly after 5:17 PM, I installed and ran tiobench. I wonder if that did it?

Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: These are nothing to worry about.
Sep 9 17:18:05 hnode2 python: gethostby*.getanswer: asked for "apt.sw.be IN AAAA", got type "SOA"
[b:04bcff7251]Sep 9 20:18:17 hnode2 yum: Installed: tiobench - 0.3.3-1.2.el5.rf.i386
[/b:04bcff7251]
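
To line up the last heartbeat entry against other events in the same syslog, a rough sketch like the following can help. The helper names are made up for illustration, the year 2008 is taken from the heartbeat startup messages (syslog itself omits the year), and the sample lines are the three entries above with the bold tags stripped.

```python
from datetime import datetime

def syslog_time(line, year=2008):
    """Parse the leading 'Mon D HH:MM:SS' syslog timestamp (year is assumed)."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")

def last_line_from(lines, program):
    """Return the last line whose program field (after the hostname) starts with `program`."""
    match = None
    for line in lines:
        fields = line.split()
        if len(fields) > 4 and fields[4].startswith(program):
            match = line
    return match

log = [
    "Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: These are nothing to worry about.",
    'Sep 9 17:18:05 hnode2 python: gethostby*.getanswer: asked for "apt.sw.be IN AAAA", got type "SOA"',
    "Sep 9 20:18:17 hnode2 yum: Installed: tiobench - 0.3.3-1.2.el5.rf.i386",
]

last_hb = last_line_from(log, "heartbeat")
install = last_line_from(log, "yum")
# Time between the last heartbeat entry and the tiobench install entry.
print(syslog_time(install) - syslog_time(last_hb))
```

Running the same helpers over the full /var/log/messages instead of this short excerpt would show whether anything else was logged between those two entries.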


 

CRITICAL: Resource drbddisk::r0 is active, and should not be!
CRITICAL: Non-idle resources can affect data integrity!
info: If you don't know what this means, then get help!
info: Read the docs and/or the source to /usr/share/heartbeat/ResourceManager for more details.
2008/09/09_23:45:21 CRITICAL: Non-idle resources will affect resource takeback!
2008/09/09_23:45:21 CRITICAL: Non-idle resources may affect data integrity!

[quote:a0972a9e65] # What this means is that if you have a shared disk and it's already mounted
# before you start heartbeat, then you could have it mounted simultaneously
# on both sides. If this happens then your disk data is toast!
# So, this is sometimes VERY BAD INDEED!
#
[/quote:a0972a9e65]
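
Since the warning is about the disk already being mounted before heartbeat starts, one way to see which DRBD-backed filesystems are currently mounted on a node is to scan /proc/mounts. This is only a sketch: the helper names are hypothetical, and the sample assumes r0 shows up as /dev/drbd0 mounted on /vz, which is a guess and not something taken from my config.

```python
def drbd_mounts(mounts_text):
    """Return (device, mount_point) pairs for mounted /dev/drbd* devices.

    mounts_text is the content of /proc/mounts, one mount per line:
    device, mount point, filesystem type, options, dump, pass.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("/dev/drbd"):
            found.append((fields[0], fields[1]))
    return found

def read_proc_mounts(path="/proc/mounts"):
    """Read the kernel's mount table (Linux only)."""
    with open(path) as f:
        return f.read()

# Hypothetical sample; on a real node use drbd_mounts(read_proc_mounts()).
sample = """\
/dev/sda1 / ext3 rw 0 0
/dev/drbd0 /vz ext3 rw 0 0
"""
print(drbd_mounts(sample))
```

Running it on both nodes and comparing the output would show whether the same device is mounted on both sides at once, which is the case the comment warns about.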


 

I ran tiobench again and this time heartbeat did not die.


 

Worst of all, I checked the logs on hnode1, and it never seemed to realize that heartbeat on hnode2 was down.


