heartbeat is stopped for some reason

hnode2 was the active node and its services are still running fine, but heartbeat itself has somehow stopped.

Here are the last log entries I see from heartbeat:

[quote:23c84415f5]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 9/1762471 ms age 0 [pid16738/MST_CONTROL]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 716/51784021 152624/74519 [pid16738/MST_CONTROL]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 200276 total malloc bytes. pid [16738/MST_CONTROL]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/14 ms age 405180540 [pid16741/HBFIFO]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 321/581 30772/13815 [pid16741/HBFIFO]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 32600 total malloc bytes. pid [16741/HBFIFO]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373810 [pid16742/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 340/657021 33264/15511 [pid16742/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 42008 total malloc bytes. pid [16742/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373820 [pid16743/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 340/394 25136/11458 [pid16743/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 25220 total malloc bytes. pid [16743/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373820 [pid16744/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 352/657052 34784/16543 [pid16744/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 43528 total malloc bytes. pid [16744/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373820 [pid16745/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 353/1244439 34868/16587 [pid16745/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 35812 total malloc bytes. pid [16745/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373820 [pid16746/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 364/657082 36304/17575 [pid16746/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 44840 total malloc bytes. pid [16746/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373830 [pid16747/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 364/454 28176/13522 [pid16747/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 36472 total malloc bytes. pid [16747/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373850 [pid16748/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 376/657112 37824/18607 [pid16748/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 46360 total malloc bytes. pid [16748/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/0 ms age 603373850 [pid16749/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 376/484 29696/14554 [pid16749/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 37992 total malloc bytes. pid [16749/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/1140417 ms age 40 [pid16750/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 388/30411871 39344/19639 [pid16750/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 51588 total malloc bytes. pid [16750/HBWRITE]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 0/518348 ms age 30 [pid16751/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 389/10885817 39428/19683 [pid16751/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: RealMalloc stats: 40928 total malloc bytes. pid [16751/HBREAD]
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: Current arena value: 0
Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: These are nothing to worry about.
[/quote:23c84415f5]


Now when I start it again I get the messages below (I have seen these before). What alternative do I have? The services are running on hnode2, and I can't just unmount and stop OpenVZ in a production environment.

I also never stopped heartbeat myself. It was evidently fine until about 5:12 PM this evening, and it stopped some time between then and now (11:48 PM).
[quote:23c84415f5]Starting High-Availability services:
2008/09/09_23:45:21 CRITICAL: Resource drbddisk::r0 is active, and should not be!
2008/09/09_23:45:21 CRITICAL: Non-idle resources can affect data integrity!
2008/09/09_23:45:21 info: If you don't know what this means, then get help!
2008/09/09_23:45:21 info: Read the docs and/or source to /usr/share/heartbeat/ResourceManager for more details.
CRITICAL: Resource drbddisk::r0 is active, and should not be!
CRITICAL: Non-idle resources can affect data integrity!
info: If you don't know what this means, then get help!
info: Read the docs and/or the source to /usr/share/heartbeat/ResourceManager for more details.
2008/09/09_23:45:21 CRITICAL: Non-idle resources will affect resource takeback!
2008/09/09_23:45:21 CRITICAL: Non-idle resources may affect data integrity!
[ OK ]
[/quote:23c84415f5]


 

hnode1 complains after heartbeat is restarted on hnode2:

[quote:d10093f29f]Sep 9 23:44:50 hnode1 heartbeat: [31055]: info: Heartbeat restart on node hnode2.ca
Sep 9 23:44:50 hnode1 heartbeat: [31055]: info: Status update for node hnode2.ca: status init
Sep 9 23:44:50 hnode1 heartbeat: [31055]: info: Status update for node hnode2.ca: status up
Sep 9 23:44:51 hnode1 heartbeat: [31055]: info: Status update for node hnode2.ca: status active
Sep 9 23:44:51 hnode1 heartbeat: [31055]: ERROR: should_drop_message: attempted replay attack [hnode2.ca]? [gen = 1220406280, curgen = 1220406281]
Sep 9 23:44:51 hnode1 heartbeat: [31055]: info: remote resource transition completed.
Sep 9 23:44:51 hnode1 heartbeat: [31055]: ERROR: No one owns our local resources!
Sep 9 23:44:51 hnode1 heartbeat: [31055]: ERROR: No one owns our local resources!
Sep 9 23:44:51 hnode1 heartbeat: [31055]: ERROR: should_drop_message: attempted replay attack [hnode2.ca]? [gen = 1220406280, curgen = 1220406281]
Sep 9 23:44:55 hnode1 heartbeat: [31055]: ERROR: should_drop_message: attempted replay attack [hnode2.ca]? [gen = 1220406280, curgen = 1220406281][/quote:d10093f29f]
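To see how often hnode1 drops messages and how far apart the two generation numbers are, the errors can be pulled out of the log with a small script. This is only a rough sketch: the regular expression is written against the exact message text quoted above, and `find_replay_errors` is a made-up helper name, not part of heartbeat.

```python
import re

# Pattern written against the "attempted replay attack" lines quoted above.
REPLAY_RE = re.compile(
    r"should_drop_message: attempted replay attack \[(?P<node>[^\]]+)\]\? "
    r"\[gen = (?P<gen>\d+), curgen = (?P<curgen>\d+)\]"
)

def find_replay_errors(lines):
    """Return (node, gen, curgen) for every replay-attack error found in lines."""
    hits = []
    for line in lines:
        m = REPLAY_RE.search(line)
        if m:
            hits.append((m.group("node"), int(m.group("gen")), int(m.group("curgen"))))
    return hits

# Two lines copied from the hnode1 log above.
sample = [
    "Sep 9 23:44:51 hnode1 heartbeat: [31055]: ERROR: should_drop_message: "
    "attempted replay attack [hnode2.ca]? [gen = 1220406280, curgen = 1220406281]",
    "Sep 9 23:44:51 hnode1 heartbeat: [31055]: info: remote resource transition completed.",
]

for node, gen, curgen in find_replay_errors(sample):
    print(f"{node}: message gen {gen} is {curgen - gen} behind curgen {curgen}")
```

On the lines quoted above, the dropped messages carry a gen that is exactly one lower than the curgen hnode1 has on record for hnode2.ca. The script only reports the numbers; it does not explain why they differ.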


 

On hnode2 I installed and ran tiobench shortly after 5:17 PM. I wonder if that did it?

Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: These are nothing to worry about.
Sep 9 17:18:05 hnode2 python: gethostby*.getanswer: asked for "apt.sw.be IN AAAA", got type "SOA"
[b:04bcff7251]Sep 9 20:18:17 hnode2 yum: Installed: tiobench - 0.3.3-1.2.el5.rf.i386
[/b:04bcff7251]


 

CRITICAL: Resource drbddisk::r0 is active, and should not be!
CRITICAL: Non-idle resources can affect data integrity!
info: If you don't know what this means, then get help!
info: Read the docs and/or the source to /usr/share/heartbeat/ResourceManager for more details.
2008/09/09_23:45:21 CRITICAL: Non-idle resources will affect resource takeback!
2008/09/09_23:45:21 CRITICAL: Non-idle resources may affect data integrity!

[quote:a0972a9e65] # What this means is that if you have a shared disk and it's already mounted
# before you start heartbeat, then you could have it mounted simultaneously
# on both sides. If this happens then your disk data is toast!
# So, this is sometimes VERY BAD INDEED!
#
[/quote:a0972a9e65]


 

I ran tiobench again and this time heartbeat did not die.


 

Worst of all, I checked the logs on hnode1 and it never seemed to notice that heartbeat on hnode2 was down.
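One way to pin down when heartbeat last wrote to the log on each node is to scan the syslog lines for the newest `heartbeat:` entry per host. This is a minimal sketch under a few assumptions: the line layout matches the logs pasted in this thread, the year (2008) has to be supplied because syslog lines do not carry one, and the 10-minute threshold is an arbitrary illustration. A quiet log only shows when the daemon last logged, not whether the process is still alive.

```python
import re
from datetime import datetime

# Syslog-style prefix as pasted above: "Sep 9 17:15:32 hnode2 heartbeat: ..."
LINE_RE = re.compile(r"^(\w{3})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s+(\S+)\s+heartbeat:")

def last_heartbeat_entry(lines, year=2008):
    """Return {host: datetime of the newest heartbeat log line}.

    Syslog lines have no year, so `year` is an assumption passed in by the caller.
    Month names are parsed with %b, which expects English abbreviations in the C locale.
    """
    latest = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a heartbeat line (e.g. yum or python entries)
        mon, day, clock, host = m.groups()
        stamp = datetime.strptime(f"{year} {mon} {day} {clock}", "%Y %b %d %H:%M:%S")
        if host not in latest or stamp > latest[host]:
            latest[host] = stamp
    return latest

def silent_hosts(lines, now, max_gap_seconds=600, year=2008):
    """List hosts whose newest heartbeat entry is older than max_gap_seconds."""
    return sorted(
        host for host, stamp in last_heartbeat_entry(lines, year).items()
        if (now - stamp).total_seconds() > max_gap_seconds
    )

# Lines copied from the logs in this thread.
sample = [
    "Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: These are nothing to worry about.",
    'Sep 9 17:18:05 hnode2 python: gethostby*.getanswer: asked for "apt.sw.be IN AAAA", got type "SOA"',
    "Sep 9 23:44:50 hnode1 heartbeat: [31055]: info: Heartbeat restart on node hnode2.ca",
]

print(silent_hosts(sample, now=datetime(2008, 9, 9, 23, 48, 0)))
```

With the sample lines and a "now" of 11:48 PM, only hnode2 is reported, since its newest heartbeat entry is from 17:15:32.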


